Science Spills Over into Congress

Congressional oversight into and investigations of pathogen spillover research begin

May 04, 2024

When I started studying pathogen spillover in 2017, I thought it would be a great way to do ecology and study pathogens without having to worry about the politics of medicine. My Princeton PhD studying “Quantitative and Computational Biology”, focused on theoretical ecology and evolutionary biology, seemed so perfectly esoteric and unreachably interdisciplinary that I envisioned myself living a life of quiet irrelevance with plenty of time to be happy & shop at REI.

Now, studies of pathogen spillover, wildlife virological research, and gain of function research of concern are all emerging topics of hot discussion, congressional investigations and oversight into the activities of researchers and science funders. Even theoretical ecology and evolutionary biology, the field combining evidence to evaluate competing theories about the origins of species and how their interactions (e.g. bats and CoVs, or human researchers and CoVs) trigger evolutionary events, is suddenly relevant for a forensic case concerning the deaths of 20 million people worldwide. In pursuit of esoteric peace, I’ve found myself at the epicenter of an historic controversy in science, and now all the esoteric squabbles and gossip and absurdities are spilling over into the broader public.

Prior to the COVID-19 pandemic, Peter Daszak was known as a shady and untrustworthy person in the field of disease ecology. We rolled our eyes at his absurd claims of being able to predict the next pandemic, even while those claims netted him millions of taxpayer dollars, because that was the name of the game in science - advertise your bold idea, and may the best salesman win. Now, as Daszak testifies before congress over his dishonest answers and pattern of deceit, there’s a dire need to bury snake oil or bat-CoV salesmen and uncover trustworthy scientists capable of providing impartial answers on the critical topic of whether SARS-CoV-2 emerged from a lab conducting dangerous gain of function research of concern on wildlife coronaviruses. Of course, who should be the experts on this topic except precisely the people doing this research? How does the public navigate the dishonesty of experts obfuscating their esoteric turf?

Daszak, as we all know, wrote a grant to the DARPA PREEMPT call in 2018 proposing to modify bat SARS-related coronaviruses in precisely the ways SARS-CoV-2 differs from wildlife bat SARS-related coronaviruses. He proposed to do this work with a variety of foreign nationals like Linfa Wang and Wuhan Institute of Virology scientists, along with another US scientist, Ralph Baric. Daszak, Baric, and the Wuhan Institute of Virology’s grant was wisely rejected by DARPA due to its risk of causing a pandemic. As someone in the epicenter, the DARPA PREEMPT grant I helped write was accepted, allowing me to develop methods for attributing pathogens to the reservoirs whence they came (including a real case study of prioritizing Nipah surveillance in Kerala, India following a Nipahvirus outbreak there). Daszak and his merry band moved on, as Daszak had other avenues of funding that were well known to people in the field, so he and his colleagues certainly had the means to continue with their DEFUSE proposal to modify a bat SARSr-CoV in a way that could very well have produced SARS-CoV-2. It would cost less than one year of a postdoc’s salary for these researchers to engineer SARS-CoV-2, so clearly this bold and terrifying idea was within their grasp.

As I watched Daszak sitting in a chair before the COVID Select committee, balding from the stress of his own deceit, sweating from the heat of the questions, and stammering in dishonest indignation, a small part of me died inside: the part of me that grew up with scientists of integrity who cared deeply about honesty, truth, and the well-being of civilization. As I read Ralph Baric’s interview, I was somewhat refreshed by what seemed like a greater degree of honesty and independence from Baric, but when Dr. Baric began talking about what matters - whether or not SARS-CoV-2 emerged from a lab, and whether or not it is consistent with a research product of DEFUSE-related work - I was saddened to see a scientist brandish their expertise and wave big words & fancy but made-up numbers around to pull the veil over the eyes of Congress, leaving them with an impression that is not accurate and does not reflect an unbiased assessment of the evidence of SARS-CoV-2 origins one gets from using numbers that are not made-up.

DEFUSE PI’s Guide to Overestimating SARSr-CoV Spillovers

For example, Baric made an argument about prior probabilities that SARS-CoV-2 emerged as a consequence of spillover versus a lab leak. To make this argument, Baric cited a paper estimating there are over 50,000 SARS-CoV spillover events annually. Dr. Baric did not mention some key details. That paper was written by DEFUSE PI’s Linfa Wang, Peter Daszak, and Shi ZhengLi, among others, so there is considerable potential for scientific deception given the conflicts of interest, and that paper didn’t actually find evidence of 50,000 spillovers a year. What did they find?

A paper by Baric’s collaborators, and the precise group of scientists under investigation for a possible lab origin of SARS-CoV-2, introduces a very significant possibility of scientific deception, of grand claims overestimating the rate of spillovers with intended effects of making people think SARS-CoV spillovers happen all the time, sowing doubt on a lab origin by precisely the line of reasoning Baric presents - if there are more spillovers every year, then our prior beliefs about SARS-CoV-2 being a spillover, all else equal, will be higher. A scientist seeking to deceive needs only estimate a sufficiently large number of spillovers to inflate-away all evidence inconsistent with a natural emergence of SARS-CoV-2, and that is precisely what seems to be happening with the 50,000 number.

What did the paper actually do, and is there any evidence of dishonesty or methods that clearly bias their estimates? How did they estimate over 60,000 spillover events a year? Bear with me here, because like Proximal Origins, a paper that immediately smelled funny for independent experts, Daszak, Linfa Wang, and Shi ZhengLi also made a rotten fish of a paper and it takes some careful scrutiny to find the source of bad smells. The researchers hid the secret sauce of their estimate underneath some fancy methods that, upon close inspection, do not support the claims of their paper and clearly overestimate the rate of spillover without transparently revealing the reliance of their estimate on bad numbers and bad assumptions.

To make it simple, the authors did the following:

Estimate bat + SARSr-CoV prevalence from field samples of bats
Estimate where bats lived
Estimate where humans overlapped with bats
Estimate human infections from bat-human interactions

The rate of spillover is then estimated as the product of these estimates - bat density, CoV prevalence in bats, bat-human overlap, and human infections given an interaction with a bat. Incidentally, this approach is a special case of the methods I developed for this problem in 2018, so I’m quite qualified to chime in on the sensitivity of this procedure to various inputs.
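A minimal sketch of that product, to show how the final answer moves with each input. Every number below is a hypothetical placeholder rather than a value from the paper, and the function name and the way the four factors are parameterized (total bats, prevalence, contacts per bat per year, infection probability per contact) are assumptions made purely for illustration.

```python
def spillover_rate(n_bats, cov_prevalence, contacts_per_bat_per_year,
                   p_infection_given_contact):
    """Expected human infections per year under a simple product model."""
    return (n_bats * cov_prevalence * contacts_per_bat_per_year
            * p_infection_given_contact)

# Hypothetical placeholder inputs, NOT values from the paper.
fixed = dict(n_bats=1_000_000, cov_prevalence=0.05,
             contacts_per_bat_per_year=1.0)

# Vary only the last factor: the chance that a contact becomes an infection.
for p in (1e-3, 1e-5, 1e-7):
    rate = spillover_rate(p_infection_given_contact=p, **fixed)
    print(f"P(infection | contact) = {p:.0e} -> {rate:.3g} infections per year")
```

Because the model is a straight product, the output is exactly proportional to each input, so an error of a few orders of magnitude in the infection-given-contact term carries through to the final spillover estimate unchanged.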

The first three steps above are pretty trivial and inconsequential for the main result of their paper. Nobody disputes that bats have CoVs, that bats live in some regions and not others, or that bats live in some places where humans also live. We can estimate high prevalence of CoVs in bats, where bats live, and where humans overlap with bats without affecting the results much because these estimates are all reasonable and the main barrier to a human infection and spillover is not overlap with bats but rather virological barriers to entry: receptor binding and cell entry of a bat SARSr-CoV in a human cell, resulting in a human infection. To build intuition, when we swim in the ocean we encounter billions of viruses, yet rarely do people get infected by viruses in the ocean because viruses in the ocean can’t enter into human cells. We snuggle our dogs when they have kennel cough and we don’t get sick because that pathogen also can’t enter our cells. We play with animals all the time, we have people watching bats fly out of Carlsbad Caverns, and people have been eating guano for thousands of years, yet we haven’t had any documented SARS-CoV pandemics except for the ones in 2002 and 2019, suggesting the barrier to infections and pandemics is not bat-human overlap, since overlap is common and relatively constant over history, but rather characteristics of the virus that may enable it to enter humans. Some virus variants may be more capable of making the jump, and indeed this is why the DARPA PREEMPT call sought “jump-capable quasispecies” and the preemption of this narrow range of jump-capable variants from entering humans.

So, the main crux to estimating SARS-CoV spillovers is to identify SARS-CoV cases in humans. We see with the current H5N1 outbreak that influenza cases in people can be detected fairly easily, especially when there is a large outbreak in animals, and heck we’re even able to detect these pathogens in our animals, so we have a lot of evidence that avian influenza and the bovine lineage circulating in American cattle today can enter humans due to some mix of receptor binding (the receptor influenza binds in birds and cows is slightly different from, but not far from, the human receptor) and large doses of virus to farm workers exposed to cows and poultry.

What about SARSr-CoVs? Why haven’t we seen many SARS-CoV spillovers before? How did the authors get around this absence of spillover evidence to estimate over 60,000 SARS-CoV spillovers annually?

This is where it gets a bit outrageous and one starts to gain the cynicism of a diligent scientist who realizes why most published findings are false.

Before diving into any scientific paper, it’s worth asking: how would you estimate the number of people infected with SARS-related CoVs annually? Ideally, we might randomly sample people, either PCR-tests of patients seeking care with a certain chief complaint or perhaps serosurveys providing immunological evidence of past exposure in a representative set of people in the population. Ideally, the serosurveys would be highly specific and done in a way to reduce the likelihood of false positives from other coronavirus exposures, as serosurveys can react to things that aren’t the target we’re looking for, and so we need to adjust for these false positives.

It also really does have to be a coronavirus because viruses vary markedly in their ability to infect people upon contact and the ways people come into contact with the viruses. Choosing the appropriate species for comparison is always an art of the biological sciences, but agreeable choices are found by focusing on the fundamental ecology (including molecular virology) of the species or ecological interaction of interest. Dairy farmers are being exposed to influenza because they are working with cows all day, poultry farmers are being exposed to influenza because they are working with chickens all day, and these human-animal interactions leading to influenza spillover don’t have an analog in bats because we don’t have domestic bats and influenza virology is very different from that of SARSr-CoVs. Nipah cases are exposed to Nipahvirus by drinking date palm sap that gets contaminated because fruit bats try to drink the sugary sap - this also isn’t a good analog because SARSr-CoVs are found in small, insectivorous bats that don’t contaminate human food by chugging buckets of sap all night long. MERS cases are exposed to dromedary camels by a unique sort of contact people have with camels in Saudi Arabia, again not appropriate for wild, small, nocturnal, insectivorous bats. Ebolavirus cases happen mostly due to exposures from bushmeat and other people during one of the several large outbreaks of ebola - the bushmeat angle may be more appropriate, after all SARS-CoV-1 first emerged in an animal trade network where civets served as intermediate hosts, but the virology of Ebola is very different from the virology of bat SARSr-CoVs so we need to be mindful of this limitation and ensure any serosurvey is conducted in a way that is less likely to be impacted by the many large Ebolavirus outbreaks with significant human-human transmission.
All of these human ecological interactions and routes of exposure vary, and the viruses causing these cases vary markedly in their baseline ability to infect humans given contact, so I’d personally avoid using these other viruses as a comparison and instead estimate SARS-related coronavirus infections, avoiding samples from people who may have been infected by human-human transmission, to properly estimate the annual rate of SARS-related coronavirus spillovers.

Okay, great, so we’ve thought about how we’d do this if we were being good and honest. What did the DEFUSE PI’s do? Below is the meat of their methods, hidden in Supplementary Table 4 for most people to overlook.

They didn’t do PCR tests of clinical samples. Instead, they combined seroprevalence studies of a variety of bat viruses. The specificity of the serosurveys is unknown or somewhere from 94-100%, and with this 94% specificity test for Nipahvirus they get 3-4% seroprevalence - in other words, we really don’t know if those 3-4% seropositive cases are actually seropositive or just false positives from a test that’s not very specific. In addition to Nipah not being an ecologically appropriate comparison to SARSr-CoVs, the serosurvey with 7 positive samples out of 171 or 227 samples can’t conclude that the 7 positives aren’t the false-positives we’d expect from a test of such low specificity.

Along this same line of criticism, the researchers also sampled 199 people in China for SARSr-CoV, HKU10-CoV, HKU9-CoV, and MERS-CoV seropositivity, and despite testing 199 people for 4 different viruses they found only a flicker of two serology tests that were positive. When you run 796 tests and only 2 tests are positive, that is also within the margin of error for false-positives from serology tests that are well-known to have the limitation of imperfect specificity. I guarantee you that Daszak, Linfa Wang, and Shi ZhengLi are all aware of this limitation, yet they don’t mention it in their paper or adjust for it in their methods.
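The false-positive concern in the last two paragraphs can be checked with a few lines of arithmetic. This is a rough sketch under two stated assumptions: that nobody sampled was truly exposed, and that the Nipah assay sits at the low end (94%) of the quoted specificity range. The function names are hypothetical helpers, not anything from the paper.

```python
from math import comb

def expected_false_positives(n_tests, specificity):
    """Mean number of positives expected if nobody was truly exposed."""
    return n_tests * (1.0 - specificity)

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

# Nipah serosurvey from the text: 7 positives out of 171 or 227 samples,
# assuming the low end (94%) of the quoted specificity range.
for n in (171, 227):
    mean_fp = expected_false_positives(n, 0.94)
    tail = prob_at_least(7, n, 0.06)
    print(f"n={n}: expected false positives = {mean_fp:.1f}, "
          f"P(at least 7 by chance) = {tail:.2f}")

# Four-virus serosurvey from the text: 2 positives out of 796 tests.
# Specificity the assay would need before 2 positives exceed the expected
# number of false positives.
break_even_specificity = 1.0 - 2 / 796
print(f"break-even specificity for 2/796: {break_even_specificity:.4f}")
```

At 94% specificity the expected number of false positives alone is larger than the 7 positives observed in either sample size, and the 2-in-796 result only rises above the false-positive floor if the assays are better than roughly 99.7% specific.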

Every example of seropositive cases starts to look more suspicious the more we critically examine this table. They estimate 6.5% seropositivity for a Malaysian virus found in fruit bats - again, very different bats ecologically & evolutionarily from those small insect-eating bats that host SARS-related CoVs - and that estimate comes from people eating fruit that was partially eaten by fruit bats, an ecological interaction that will never happen with insectivorous bats. Peter Daszak, Linfa Wang, and Shi ZhengLi et al. claim a study estimated 14% seroprevalence of ebolavirus in a 2015 study in Congo. However, if you read the actual study, the authors don’t report 14% seroprevalence - they report 0.5% seroprevalence for Marburg from 809 samples (again, inconclusive of any positives for a serology test) and a 2.5% seroprevalence for Ebola in a region that has experienced 14 ebolavirus outbreaks with human-human transmission since 1976. In other words, it’s not clear how many of the 2.5% of ebolavirus seropositive cases were actually derived from spillovers as opposed to human-human transmission, and we can’t use human-human transmission events to estimate bat-human spillovers.

The last & greatest seroprevalence is where it gets most absurd. The highest seroprevalence the DEFUSE PI’s estimate - and use in their model to estimate the rate of bat SARSr-CoV spillovers - comes from a serosurvey of SARS-CoV-2 AFTER SARS-CoV-2 caused a pandemic. Like the ebolavirus serosurvey in Congo (which the authors over-estimate by a factor of 6-7 compared to the original paper), one can’t tell what fraction of these SARS-CoV-2 seropositive samples were due to spillover from bats and what fraction of these SARS-CoV-2 cases were due to human-human transmission. I would bet nearly all my money that these 3 SARS-CoV-2 seropositive cases out of 12 samples are more likely people exposed to the virus circulating in a global human pandemic than 3 independent bat spillovers.

To recap, the authors’ estimates of bat SARSr-CoV spillovers come from serosurveys of many other bat viruses that spill over due to very different ecological processes (e.g. fruit dropped by fruit bats, bushmeat consumption for Ebolavirus, date palm sap consumption for Nipahvirus). The serosurvey results are a mix of either indistinguishable from a reasonable false positive rate of serology tests, over-reported compared to the literature cited without justification, or very likely due to human-human transmission like their serosurvey of SARS-CoV-2 and not due to independent bat spillover events.

There were 31 seropositive tests, total, from around 1,500 serology tests run, or 2% seropositive humans with tests whose specificity is less than 98% on bat viruses whose spillover is driven by completely different ecological interactions than SARSr-CoVs.
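One standard way to adjust an apparent seroprevalence for imperfect test accuracy is the Rogan-Gladen estimator. The sketch below applies it to the pooled 31-in-1,500 figure as an illustration only: pooling different assays under a single specificity, and setting sensitivity to 1.0, are simplifying assumptions of mine, and this is not a step the paper performs.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence, clipped to [0, 1]."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test performs no better than chance")
    adjusted = (apparent_prevalence + specificity - 1.0) / denom
    return min(1.0, max(0.0, adjusted))

apparent = 31 / 1500  # pooled figure quoted in the text

for spec in (0.99, 0.98, 0.97):
    # sensitivity=1.0 is an assumption made to isolate the effect of specificity
    est = rogan_gladen(apparent, sensitivity=1.0, specificity=spec)
    print(f"specificity {spec:.2f}: adjusted prevalence = {est:.4f}")
```

Once the assumed specificity drops to about 98% or below, the adjusted prevalence falls to roughly zero, which is the sense in which a 2% positive rate from a test of that quality carries little information about true exposure.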

From these 31 seropositive tests of dubious relevance to SARSr-CoV spillover, the authors estimate 60,000 SARSr-CoV spillovers a year. If we adjusted for false positives from unspecific tests and removed viruses whose emergence is due to interactions that never happen with insectivorous microbats, the resulting estimate would be less than 1 SARS-CoV spillover a year as we have no empirical documentation of such spillovers except for one outbreak of SARS-CoV-1 and the Mojiang miners infected with a virus related to RaTG13. Careful examination of the data suggests any numbers crunched from the serosurveys above will profoundly overestimate the rate of SARSr-CoV spillovers - actual infections - in the human population every year and the truth is we don’t have evidence of 60,000 spillovers a year. That number is made up by a stack of methods tracing back to an inappropriate compilation of serosurveys unadjusted for low specificity and different ecological drivers of infection.

From that paper, written by DEFUSE PI’s with significant potential for deception and, sure enough, with glaring methodological limitations buried in supplementary table S4, Ralph Baric testifies to Congress claiming that there are 50,000 spillovers a year for 20 years, so 1 million spillovers, and so therefore it’s a million times more likely SARS-CoV-2 emerged from a spillover than from a lab. Daszak et al. know that if they could inflate the rate of spillovers, it would lead scientists down the road Baric travelled.
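The arithmetic behind that argument is linear in the assumed annual rate, so the conclusion is only as good as the rate fed into it. A quick sketch, where the alternative rate of one spillover a year is a hypothetical comparison value (in line with the "less than 1 a year" figure argued for above) rather than a measured quantity:

```python
def cumulative_spillovers(rate_per_year, years):
    """Total spillovers implied by a constant annual rate."""
    return rate_per_year * years

YEARS = 20

quoted = cumulative_spillovers(50_000, YEARS)   # rate attributed to Baric in the text
alternative = cumulative_spillovers(1, YEARS)   # hypothetical: about one per year

print(f"quoted rate:      {quoted:,} spillovers over {YEARS} years")
print(f"alternative rate: {alternative:,} spillovers over {YEARS} years")
print(f"the weight given to spillover shrinks by a factor of {quoted // alternative:,}")
```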

Dr. Baric’s numbers are wrong. He hasn’t done due diligence to study the limitations of the numbers he used when providing what seems like expert opinion to congress but which instead is a superficial reading of the literature written by scientists with a massive conflict of interest and parroted by someone who also has every reason to willfully believe the numbers reported by his colleagues who proposed to modify bat SARS-related CoVs in Wuhan in 2018.

Baric’s testimony used overestimates of SARS-coronavirus spillover rates, published by DEFUSE PI’s without disclosing who published the paper or presenting a fair account of significant - I would argue fatal - limitations of that estimate.

As you can tell, I try to do my due diligence by carefully examining the methods AND supplemental information of papers I’m citing. Sanchez et al. (2021) claims to estimate 60,000 SARSr-CoV spillover events a year, but underneath the giant stack of methods the results derive entirely from serosurveys that don’t contain any information about SARSr-CoV spillover rates. When I see people like Baric repeating these numbers without having read the papers closely or considered limitations of the statistical methods (methods I helped develop!), repeating these claims as if they are sound, unbiased, without the potential for deception from people with the most to lose in the event of a lab origin, and predictably using these overestimates to inflate-away evidence of a lab accident, I can’t help but voice concern that this member of the National Academy of Sciences, a body established to provide impartial scientific assessments to policymakers, is not providing impartial scientific assessments to policymakers. Forgive me, but even in my position of not having any membership in any scientific society except SACNAS, the Society for the Advancement of Chicanos and Native Americans in Science, I feel a civic duty to report the numbers honestly and not play scientific telephone parroting numbers from people under investigation for likely causing a pandemic.

There’s more, too.

“Biostatistical BS”

Dr. Baric is one of the fathers of a technique called “efficient reverse genetic systems”, or methods for efficiently synthesizing RNA viruses from scratch so you can modify them later. Valentin Bruttel, Tony Van Dongen and I examined the methods people used to synthesize coronaviruses from scratch before COVID, looked at the genome of SARS-CoV-2, and came to the judgement that the “Endonuclease fingerprint indicates a synthetic origin of SARS-CoV-2”. Personally, my preferred title was that the fingerprint is “consistent with” a synthetic origin, and that’s how I’ve attempted to communicate it here and in the paper, but “indicates” was preferred by the group, it’s a fair word, and I didn’t think this was my hill to die on, so “indicates” is used the same way a canary dying in a coal mine “indicates” the presence of toxic gases but doesn’t “prove” it since canaries also die from other causes.

Anyhoo, for a pop-science recap: synthetic viruses are made by glueing together similarly-sized chunks of DNA with special cutting/pasting sites. Researchers look at a genome, add/remove cutting/pasting sites using silent mutations that change the DNA sequence to yield these similarly-sized blocks without affecting the resulting virus. The resulting viruses often have regularly spaced cutting/pasting sites left in their genome and these sites differ from closely related coronaviruses by exclusively silent mutations. SARS-CoV-2 has regularly-spaced cutting/pasting sites, like Frankenstein stitches attaching arms and legs at predictable junctures, and these cutting-pasting sites are filled with silent mutations. We examined the genomes of other coronaviruses to quantify the wild-coronavirus odds of the unusual spacing of cutting/pasting sites (1/1400 odds in wild coronaviruses) and the hotspot of silent mutations (1 in 20 million odds in wild coronaviruses). These odds are low enough that we wrote a paper documenting this pattern and contextualizing it as consistent with pre-COVID methods for making reverse genetics systems.

The BsaI/BsmBI restriction map of SARS-CoV-2 is an anomaly among wild CoVs in having equally-spaced restriction sites modified by exclusively silent mutations, and 8-9x higher rate of silent mutations within these sites compared to the rest of the genome. Such an anomalous map is consistent with a synthetic origin.
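For readers who want to see what a restriction map is computationally, here is a rough sketch of locating BsaI and BsmBI recognition sites in a sequence and measuring the fragments between them. It is not the code from our paper. The toy sequence is invented, and treating the start of each recognition site as the cut position is a simplification, since these enzymes cut a few bases away from the site they recognize.

```python
import re

# Recognition sequences (5'->3'); the reverse complement marks a site on the other strand.
SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(genome, motifs=SITES):
    """Sorted 0-based start positions of every motif on either strand."""
    positions = set()
    for motif in motifs.values():
        for pattern in {motif, reverse_complement(motif)}:
            # A lookahead keeps overlapping matches from being skipped.
            for match in re.finditer(f"(?={pattern})", genome):
                positions.add(match.start())
    return sorted(positions)

def fragment_lengths(genome_length, positions):
    """Lengths of the pieces a linear genome is split into at the given positions."""
    cuts = [0] + list(positions) + [genome_length]
    return [b - a for a, b in zip(cuts, cuts[1:])]

# Invented toy sequence: one BsaI site (forward strand), one BsmBI site (reverse strand).
toy = "A" * 10 + "GGTCTC" + "A" * 20 + "GAGACG" + "A" * 5
sites = find_sites(toy)
print("site positions:", sites)
print("fragment lengths:", fragment_lengths(len(toy), sites))
```

Running the same two functions over a real coronavirus genome gives the list of fragment lengths whose spacing can then be compared across wild coronaviruses.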

Baric was asked about our paper in his congressional testimony:

Baric had some strong opinions about our work.

First, Dr. Baric says that we wouldn’t expect to find these sites present in other bat strains. However, below is the last reverse genetics system made by the Wuhan Institute of Virology, rWIV1 - they used several pre-existing sites (4387, 12079, and 27352) to make their infectious clone, otherwise they knocked out one site (1571) and added four more (8032, 10561, 17017, and 22468). Reverse genetic systems use the pre-existing restriction map and modify it minimally to create a suitable product. For SARS-CoV-2, with the enzymes BsaI and BsmBI, the hypothesized progenitor likely had the highly conserved restriction sites, most CoVs have too many BsaI and BsmBI sites that prohibit efficient synthesis, and in our theory the researchers removed a few of them with silent mutations to generate the pattern observed in SARS-CoV-2.

Baric says we wouldn’t expect to find pre-existing sites in the genome, but for the last infectious clone published by the Wuhan Institute of Virology pre-COVID they left in many of the pre-existing restriction sites in the genome.

Baric claimed we wouldn’t expect to find these sites in other CoVs, but prior work contradicts his claim. Baric went on:

Baric claims the smallest fragment is too small for his comfort. He says it is about 300 base pairs. In reality, it’s 652 base-pairs, over twice as long as Baric claims. Baric then says he wouldn’t make a clone like that, it would irritate him. This is an argument akin to seeing a drawing of a stick figure and saying it couldn’t have been drawn by a human because the disproportionate arms or unequally sized legs would irritate you. However, more empirically, look back at the rWIV1 genome - that contained a very short segment, segment C2, and segment C2 was 1500 base pairs long, admittedly longer than our segment but small segments are manageable, especially if they contain regions of the genome you don’t intend to tinker with so they can be used as a final link to construct the full virus. Baric also claims the first segment is too small, but the first segment is 2,188 basepairs long, longer than rWIV1’s fragment C2 and almost as long as rWIV1’s fragment C1.

When evaluating whether/not a particular genome is a research-related product, it helps to evaluate prior work and determine if this would help researchers accomplish stated aims. In other words, suppose this was a research-related product, what could you do with it? Does it make some kinds of work easy and other kinds of work hard or impossible? In rWIV1, the researchers didn’t initially make that segment C2 until they realized segment C was toxic to bacteria when they tried to mass-produce it, so they had to cut segment C into two pieces in order to fulfill their experimental purposes. In DEFUSE, researchers wanted to swap Spike genes and insert edits, like furin cleavage sites, inside the spike gene. Could the restriction map in SARS-CoV-2 permit such work?

In prior work by familiar names Ben Hu, Linfa Wang, Peter Daszak, Shi ZhengLi et al. (2017), researchers used the restriction enzymes BsaI and BsmBI to swap spike genes. Hu et al. (2017) was the only time pre-COVID when researchers used this pair of restriction enzymes - BsaI and BsmBI - on a coronavirus infectious clone, and incidentally these are the exact two restriction enzymes for which we find the anomalous spacing of restriction sites AND the hotspot of silent mutations in SARS-CoV-2. The restriction map of SARS-CoV-2 would allow the researchers to swap Spike genes and insert furin cleavage sites using the exact same methods they used in 2017. Additionally, the small segment is the only segment flanked by different enzymes - all other segments can be flanked by exclusively BsmBI or BsaI, simplifying digestions and enabling the same insertion methods used by these authors in 2017. Heck, the authors could use the exact same Spike genes flanked by BsmBI used in 2017 to replicate their study on a new infectious clone - this reverse genetics system in SARS-CoV-2 is perfectly suited for their research program.

Baric’s testimony to congress on the topic of our research involved him using made-up numbers (300bp) and subjective claims (an irritating small fragment) in an attempt to rebut our paper’s 1/1400 anomaly of a strange pattern of fragment lengths. Like many others, he avoids commenting on our 1 in 20 million anomaly of hotspots of silent mutations in these same cutting/pasting sites used by DEFUSE PI’s in 2017 which generate the anomalous fragment lengths in SARS-CoV-2. The silent mutation pattern is an essential piece of the puzzle as it is a far more significant result and one can’t explain how we got so lucky to find so many silent mutations by focusing on these restriction sites yielding a pattern of regularly-spaced sites that looks artificial and statistically is anomalous among coronaviruses.

Baric called our work “biostatistical BS”, but our numbers were estimated empirically with wild coronavirus genomes, standard methods, and reproducible code. If there was any biostatistical BS, it may be Daszak et al. hiding bad serosurveys in supplemental table S4, Baric citing their 60,000 spillovers annually without due diligence, and Baric’s own “BS”, for lack of a better word, bullshitting on the actual empirical numbers of fragment lengths in relation to prior work or BSing that a fragment being irritating to Baric implies the reverse genetics system wouldn’t be useful for the research programs underway in Wuhan.

When Scientists Mislead Congress

Congressional oversight committees are currently investigating a very serious matter of the likely research-related origin of SARS-CoV-2 that may be a result of research funded by the US taxpayer through Daszak’s EcoHealth Alliance subcontracts to the Wuhan Institute of Virology. I have to emphasize every time I discuss this that 1 million Americans are dead. 20 million people worldwide are dead. This is not a laughing matter, this is not the time for ego and mediocrity and scientific bullshit. The existence of many pieces of evidence pointing towards a research-related origin all triangulate to the collaboration between Peter Daszak, Linfa Wang, and Shi ZhengLi. How curious, and unfortunate, that it is scientific estimates by these same serially conflicted and untruthful researchers that Baric is relying on for his own estimation that a lab origin is unlikely. Of course a research-related accident should involve researchers, and those researchers continue to obfuscate the science by publishing papers that mislead the world about the facts of the matter. Their expertise, our journals, and the media’s trust in experts following a pandemic, are all being weaponized to mislead the world.

A part of me dies on the inside when I see these scientists mislead members of congress with fraudulent numbers. Numbers are the heart and soul of science, the reproducible units of measurement we must communicate faithfully to ensure others can compare their findings to ours.

A part of me dies on the inside when bad numbers parroted to congress and other managers representing the will of the people were published in a Nature journal, a conglomerate of scientific narrative-manufacturing journals that receives a significant amount of its revenues from China, a subsidiary of Elsevier, another company that receives a significant amount of its revenues from China, a subsidiary of RELX Corp, another company that receives a significant amount of its revenues from China and employs former Chinese government officials in its upper ranks. The core institutions we rely on for science, for communicating numbers, did not seem to read the numbers in supplementary table S4 or force the authors to evaluate the suitability of their estimates. These same journals refuse to publish articles popularizing evidence consistent with a lab origin.

A small band of scientists may have caused a pandemic, and they are using science - numbers and estimates and their own expertise granting authority to comment on methods - and science institutions like our journals and Academies to sow doubt in the potential roles of their colleagues and their funders in this research-related accident. By not resisting such abuses of science and scientific institutions, by not combating such unethical behavior, many academic virologists are increasing the distrust of their discipline, raising the stakes of the issue by increasing the collateral damage this small band of researchers and their funders will cause.

A part of me dies on the inside because I became a scientist precisely to cut through bullshit and arrive at the truth, and I thought our institutions were designed to support that, I thought other scientists were courageous enough to speak up, yet here are scientists bullshitting in congress, obscuring the truth with bad science, publishing bad numbers in big journals, and the majority of other scientists have gone silent in a pandemic of scientific cowardice.

The truth is that we don’t have reliable estimates of SARS-related coronavirus spillovers. The truth is that the absence of prior pandemics suggests some combination of a low rate of spillovers and/or low odds that a spillover yields a highly transmissible SARS-related coronavirus like SARS-CoV-2.

SARS-CoV-2 is an anomaly and we have no evidence to suggest that SARS-related coronaviruses spill over regularly. The only well-documented SARS-related coronavirus spillover we observed before COVID was SARS-CoV-1, an animal trade outbreak resulting in many spillover events over a geographically broad animal trade network, with both contact tracing and serosurveys identifying early infections concentrated not just in animal handlers but in civet handlers specifically, with 25 animals sampled and 7 testing positive (mostly civets) with progenitors 99% similar to the virus found in humans. All pieces of evidence telling a consistent story for SARS-CoV-1 emergence were collected without requiring a precedent, because it is easy to trace SARS-related coronavirus outbreaks, like other zoonoses, to their source with modern knowledge and methods.

Since SARS-CoV-1, there were at least 6 lab accidents in China, so of the 7 prior documented SARSr-CoV emergence events only 1 was a spillover event due to an animal trade outbreak and 6 were lab accidents. We do not have data otherwise - the 60,000 spillover events mentioned by Baric never happened, they are nebulous numbers conjured into print by a stack of methods built on a hidden, rotten foundation of unadjusted false-positive SARS-CoV-2 serosurveys, Nipah serosurveys, ebolavirus serosurveys in regions with human-human transmission and published seropositive rates far less than those used under the hood in models by Daszak, Wang, and ZhengLi.
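To make the point about unadjusted serosurveys concrete, here is a minimal sketch of the standard Rogan-Gladen correction, which adjusts a raw seropositive rate for an assay's imperfect sensitivity and specificity. The function name and every number below are hypothetical, chosen only for illustration; they are not taken from any of the papers or testimony discussed here.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Adjust a raw seropositive rate for assay error (Rogan-Gladen estimator).

    adjusted = (apparent + specificity - 1) / (sensitivity + specificity - 1),
    clipped to the interval [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("assay must perform better than chance")
    adjusted = (apparent_prevalence + specificity - 1.0) / denom
    return min(1.0, max(0.0, adjusted))


# Hypothetical inputs: 3% of samples test positive on an assay with
# 95% sensitivity and 98% specificity (a 2% false-positive rate).
apparent = 0.03
adjusted = rogan_gladen(apparent, sensitivity=0.95, specificity=0.98)

# Hypothetical case where the raw rate equals the false-positive rate.
at_noise_floor = rogan_gladen(0.02, sensitivity=0.95, specificity=0.98)

print(f"apparent: {apparent:.4f}, adjusted: {adjusted:.4f}")
print(f"raw rate at the false-positive floor adjusts to: {at_noise_floor:.4f}")
```

Under these made-up assay characteristics, most of a 3% raw seropositive rate is attributable to false positives, and a 2% raw rate is indistinguishable from zero true exposure. Any spillover count scaled up from an unadjusted rate inherits that inflation.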

Bad Scientists Undermine Science

Congress and other investigators desperately need honest quantitative biologists, ideally those with knowledge of ecology and evolutionary biology, molecular biology, mathematical modeling, and statistical methods used to study pathogen spillover. Sadly, such scientists are rare. I was in the first class of Princeton’s Quantitative and Computational Biology program, I was the first in my class to graduate, and I am the only one I know of who also studied pathogen spillover. Quantitative literacy is rare in biology because biology, historically, has been a discipline engaged in field work - catching bats, surveying elephants - and lab work - making buffers, aliquoting samples, designing primers, etc. It’s not common for someone to know the molecular methods for bioengineering, the protocols for epidemiological estimates of disease incidence (e.g. bat SARSr-CoV spillover), field methods for sampling bats, evolutionary methods for estimating the evolution of furin cleavage sites, and forensic statistical methods for evaluating competing theories.

From my rather lonely vantage point of interdisciplinary excellence applied to a controversial issue, I’ve looked down the mountain to see powerful scientists desperately clawing their way to my perch, trying & failing to discredit our work. In their efforts to discredit fair work and amplify bad work, we’re witnessing a very dangerous pattern of scientists abandoning the objectivity, honesty, and humility that motivates trust in science. We’re seeing scientists abandon their civic duty to provide impartial consultations to managers like Congressional representatives. Baric just made up small numbers to congress regarding the restriction map of SARS-CoV-2 when smaller numbers boosted his arguments, and he utilized made-up bigger numbers from Daszak and ZhengLi without indicating where those numbers came from because the bigger numbers boosted his arguments then. The obvious effect of allowing scientists to play fast and loose with numbers is that the true numbers for estimating the likelihood of a lab accident will be obscured, the public unfamiliar with scientists’ methods won’t be able to tell which numbers are right, and doubt will fester where greater certainty ought to be.

Science has always had its snake-oil salesmen and ludicrous arguments. Daszak was a well-known snake-oil or bat-soup salesman pre-COVID, peddling oversold arguments that he could predict the next pandemic to secure millions in PREDICT funding, that sampling random animals all around the world will make us safer to secure millions in CEPI’s Global Virome Project funding, that SARS-related coronaviruses are poised for emergence to secure millions in NIH/NIAID funding. Pre-COVID, we all rolled our eyes at the peddlers although some, like me, felt a civic duty to do the back-pedaling and counter absurd claims or unfounded theories. When half of science is peddling and the other half is back-pedaling, science comes to a halt and millions of dollars are wasted as they are granted to unworthy recipients with bad ideas based on bad statistics, bad logic, and bad faith.

Scientists everywhere need to take the issue of COVID origins far more seriously and start doing their part to be far more objective, far more excellent, and far more humble to distance ourselves from the abomination of science on parade in front of Congress these days. Our scientific institutions, their credibility and their funding, rely on our objectivity. The list of transgressions from famous scientists is growing longer and their grifting is growing more visible, posing a serious threat to science and our society. There is no Anti-Science movement; the greatest threat to science is from within. We need dishonest scientists to sulk into obscurity so that more ethical scientists can rise in prominence. We need to show the world what good science looks and sounds like.

Kristian Andersen and Eddie Holmes published a paper saying a lab origin is “implausible” when Andersen believed it was “so friggin likely”, failing to acknowledge that the funders of dangerous coronavirus work in Wuhan prompted, edited, and promoted their work. When testifying under oath, Andersen claimed he didn’t have an NIH/NIAID grant under Fauci’s review, yet he did - Fauci could’ve rejected Andersen’s grant but instead, after Andersen published a paper claiming a lab origin from a lab Fauci funded is “implausible”, Fauci gave Andersen millions of dollars in NIAID funding.

That behavior undermines trust in science.

Fauci parroted Andersen et al’s paper on national television without disclosing the funding his agency provided to Wuhan, his role in prompting the paper, all while pretending he didn’t know who the authors were. Fauci then lied under oath that he never funded gain of function research of concern in Wuhan, yet now we have receipts that NIH provided gain of function funding waivers to Ralph Baric to study chimeric WIV coronavirus constructs, NIAID is listed as a funder of Ben Hu et al’s 2017 research making unnatural coronavirus chimeras with the goal of finding something more infectious, and even Ralph Baric confessed to Congress that Daszak’s 2018/2019 progress report to NIAID on coronavirus work in Wuhan was gain of function research of concern.

That behavior undermines trust in science.

Daszak withheld DEFUSE when a virus looking like a DEFUSE research product emerged in Wuhan, the same place he planned to make such a virus. When he was appointed to be the US emissary to the WHO investigation, or to lead the Lancet COVID Origins investigation, or to contribute to the National Academy of Science’s letter to OSTP claiming a lab origin is implausible, Daszak did not disclose DEFUSE but, instead, seems to have picked all his friends to vote alongside him in these scientific committees and reports. Daszak lied to the US government about the risks of his research and he lied to Congress about his plans to conduct this work in Wuhan.

That behavior undermines trust in science.

I could go on, but the point is that I care a lot about science and the biggest threat I see facing science as it spills over into congressional investigations is that many prominent scientists have been dishonest and unethical without consequence, and that needs to change. I care so much about science that I’d rather be the one who tells the world my work is wrong than let the world believe incorrect science is right, while these people would rather peddle lies to protect their reputations even if it undermines all of science. A part of me dies inside when I see scientists undermining public trust in science - ironically, all while parroting claims that their detractors are “Anti-Science” (as Peter Hotez does, without disclosing that he, too, was subcontracting risky virological work to the Wuhan Institute of Virology)! I’ve never seen such an abomination of science before in my life, the festering rot of bioscience grift enabled under Fauci’s tenure at NIAID is now being exposed to light, and that light may reveal weaknesses in the foundations of science funding, publication, and means of career advancement leading to the selection of peddlers at the expense of honest back-pedalers. A small number of highly conflicted scientists are abusing science, their appointments to scientific positions of power, their credibility as experts, and their publications in journals with the clear intention and effect of misleading the world about the probable lab origin of SARS-CoV-2.

Science has always been an epistemological warzone with ground rules, but with COVID origins it seems many of the ground rules have been abandoned. Scientists are publishing bullshit about an “implausible” lab origin, that lab origin theories are “conspiracy theories”, that there are “60,000” SARS-related coronavirus spillovers annually, a Wet market outbreak as “dispositive” evidence of a natural origin, buggy code claiming two branches in the SARS-CoV-2 evolutionary tree is evidence of two spillovers, a single read of SARS-CoV-2 among 200,000,000 reads (a minute fraction of which were raccoon dogs) hailed in The Atlantic as “the strongest evidence yet” of a natural origin, and more. The festering abomination of science behind why most published findings are false is spilling over into Congress, and in the process the arrogance of a small number of extremely vocal and powerful yet heavily conflicted scientists is doing immense harm to the reputation of academic science.

I refuse to participate in such a system. I’m doing everything in my power to counter bad science in this field. That’s why I read Ralph Baric’s arguments and evaluated them closely with sharp pencils to ensure his numbers add up and his probabilities multiply appropriately. That’s why I read Proximal Origin, Worobey et al., Pekar et al., Crits-Christoph + Debarre et al., Daszak et al., and other papers first with an open mind and then, after excusing myself to vomit and cry a little, with a desire to back-pedal.

At some point, we need the scientists doing the back-pedaling - often without National Academy posts, NIH/NIAID connections, or alignment with the profit motives of Elsevier - to be given the full opportunity to write the science they see and tell the science as it is without having to be filtered through the congressional testimony of the peddlers. If only Congress could hear what science really sounds like, what careful examination and impartial judgements from qualified experts in the field look like, if only they could find an unbiased scientific consultant eager to help them arrive at the correct answers in this epistemological warzone, then we could rescue the credibility of science and apply the necessary intelligent heat to the unethical scientists, funders, publishers, and other scientific bodies that have abandoned their civic duty to help society learn the truth.

It sucks to watch a more mature scientist give a congressional testimony full of rudimentary mistakes and evidently superficial grasps of the data and probabilistic methods for theoretical reasoning, as Ralph Baric did, and it sucks to see lies from Peter Daszak permeate discourse. It sucks that when I have other things I’d like to do with my time to aid civilization I find myself defending our findings indirectly, bickering with scientists’ congressional testimony through my Substack because journals are too conflicted to publish scientists’ competing views and Democrats on the COVID select committee appear to be successfully misled on the evidence & sound methods pointing to a lab origin.

More than anything, it sucks to spend my whole life trying to be the best scientist I could be, only to learn that NIAID prefers foot soldiers and fools willing to peddle lies to cover up the obvious truth that NIAID funded gain of function research of concern in Wuhan, that such research may have caused a pandemic (or this may have been a PLA project and scientists are providing cover fire nonetheless). It sucks that scientists as a whole are not rising up to defend the truth, but instead the systems of power in modern science seem to have interests of their own. The US will continue to fund the health sciences, so even if NIAID is reformed science will go on, but we have an obligation to ensure the science that goes on is a safe and efficient use of tax dollars.

As science spills over into congress, I’m disappointed that the world gets to see this modern state of science, where most published findings are false, where risks are mismanaged, where funders like Fauci, Collins, and Farrar are Popes able to label inconvenient theories disinformation with the backing of US government censorship, where scientists make up numbers and other scientists parrot their numbers without understanding how they were calculated, or what the true numbers are.

Many scientists bemoan disinformation, but few critically examine the quality of information coming out of scientists. We need to clean up our scientific system before casting stones. If most published findings are false, then why do we fund science? Why don’t we fund meta science for a few decades first to develop better ways to ensure scientists publish the truth & funders manage risk + fund productive ideas?

One hopes the “good guys” win in the end, but that is never a given. If we want the good guys to win and if we want science to be all it can be for society, we need to push back against dishonest grifters like Daszak, bad numbers from Baric, publication biases in Elsevier, funding biases in NIAID, excessive influence in science from leading health science funders, and all the other social malignancies that undermine science.

AJB

May 4

Part of me also dies when I see all the willful blindness and the societal enthusiasm for somnambulant "groupthink" which is so pervasive today. Thankfully we have brave, coherent experts like you who remain devoted to real science and Veritas – no matter what – and who retain the grace to share it via these wonderful Substacks. Without devotion to facts, data and critical analysis, science falls away and takes civilization down with it. If we do survive as a civilization and a species, it will be because of intellectual work like this.


John Wilby

I noted your last comment lamenting that it would be great to have an ethical scientist present in front of Congress. Do you think they really want that? It always makes me wonder how many in Congress are influenced by money from Big Pharma and others who would benefit from obfuscation.
