A team of bioinformaticians, biophysicists, and medical professionals released a preprint uncovering an unreported coronavirus infectious clone in Wuhan.
For the past few years, Adrian Jones, Daoyu Zhang, Louis Nemzer, Steven Massey, Yuri Deigin, and Steven Quay have become a dream team of researchers committed to examining online databases and other sources of evidence for clues about the origins of SARS-CoV-2. These authors bring expertise across many fields, and their interdisciplinary backgrounds let the team cover more ground than any one of them could alone. They have published several leading forensic investigations, most notably an analysis finding evidence of potential fabrication of coronavirus genomes. In their latest preprint, the authors have uncovered shocking evidence of an unreported lab-derived coronavirus in Wuhan.
What did they do, what the heck does that mean, and why does it matter?
What they did
Thanks for reading A Biologist's Guide to Life! Subscribe for free to receive new posts and support my work.
The researchers looked closely at data from a 2019 study sequencing rice in Wuhan, China, and found an evidently lab-made coronavirus lurking in the background.
When we sequence the genetic material of any organism, such as rice in Wuhan, we often use what’s called “shotgun” sequencing. Rather than reading every single A, T, G, and C in one long, drawn-out effort, we instead simultaneously read a whole bunch of short snippets, called “reads”. If we were sequencing reads of the alphabet, we might get a dataset that looks like this:
From that dataset, we can align e.g. “BCDEF” to “ABCDEF”. We can then “pair the ends”, such as the “…EFG” and “EFG…”, to stitch together the entire alphabet from the “sequence read archive” above.
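To make the stitching step concrete, here is a minimal Python sketch of overlap-based assembly on made-up alphabet reads. The reads, the function names (overlap, drop_contained, greedy_assemble) and the minimum overlap of three letters are all illustrative assumptions, not anything taken from the Jones et al. preprint, and real assemblers are far more sophisticated than this greedy toy.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the shared stretch is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(seqs: list[str]) -> list[str]:
    """Remove duplicates and any sequence that sits wholly inside another."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != t and s in t for t in unique)]


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = drop_contained(reads)
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])


# Hypothetical "reads" of the alphabet, in scrambled order.
READS = ["NOPQRSTU", "BCDEF", "ABCDEFG", "STUVWXYZ", "EFGHIJKL", "JKLMNOP"]

print(greedy_assemble(READS))
```

The short read “BCDEF” is absorbed because it sits entirely inside “ABCDEFG”, and the remaining reads chain together through their shared three-letter ends into a single contiguous sequence.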
To make that read archive from a biological sample, however, we need to use a sequencing machine, and in order to use a sequencing machine we often need to use a whole suite of laboratory stuff, such as test tubes or ‘wells’ in which we deposit samples for sequencing. If a test tube or well is contaminated, possibly by not being washed well enough after the last use, then the sequencing machine can pick up additional DNA.
Suppose that we wanted to sequence the alphabet, but the last person to use the machine sequenced the word “CORONAVIRUS” and didn’t clean the test tubes well enough, so some CORONAVIRUS remained stuck to the walls of the test tube like food on a plate. With this contaminant lining the walls of the test tube, the sequencing machine might pick up some of the contaminant and return the following dataset with contaminant sequences highlighted in bold:
If we paired the ends of reads in our dataset above, we would still get the alphabet as one contiguous sequence and we’d get a second contiguous sequence from the contamination: CORONAVIRUS.
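One simple way to see why the contaminant stands out is that its reads share nothing with the sequence we expected to find. The Python sketch below splits a toy dataset by checking whether each read shares any three-letter snippet (a “k-mer”) with the expected reference. The reads, the function names (kmers, split_reads) and the choice of k = 3 are hypothetical, chosen only to fit the alphabet example; this is not the pipeline the preprint authors used.

```python
def kmers(seq: str, k: int = 3) -> set[str]:
    """All overlapping length-k snippets of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def split_reads(reads: list[str], reference: str, k: int = 3):
    """Separate reads that share a k-mer with the reference from those that do not."""
    ref_kmers = kmers(reference, k)
    on_target, off_target = [], []
    for read in reads:
        if kmers(read, k) & ref_kmers:
            on_target.append(read)   # looks like the thing we meant to sequence
        else:
            off_target.append(read)  # candidate contaminant
    return on_target, off_target


# Toy data: the alphabet is what we meant to sequence (hypothetical reads).
REFERENCE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
READS = ["ABCDEFG", "CORONA", "EFGHIJKL", "ONAVIR",
         "JKLMNOP", "VIRUS", "NOPQRSTU", "STUVWXYZ"]

on_target, off_target = split_reads(READS, REFERENCE)
print("expected:", on_target)
print("contaminant candidates:", off_target)
```

The off-target reads here overlap one another (“CORONA” and “ONAVIR” share “ONA”, “ONAVIR” and “VIRUS” share “VIR”), which is why stitching them yields a second contiguous sequence rather than random noise.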
In the preprint, the authors looked at a sequence read archive from researchers studying rice in Wuhan (the alphabet in our example above) and found evidence of contamination by a coronavirus genome.
Coronaviruses don’t infect rice, so the coronavirus didn’t come from rice. Coronaviruses do infect people, so the first hypothesis is that a person sampling rice could’ve sneezed and deposited the coronavirus. However, two pieces of evidence suggest this particular coronavirus did not come from people. First, the closest relatives of the coronavirus are a clade of bat coronaviruses documented by researchers in the Wuhan Institute of Virology & their international colleagues at EcoHealth Alliance. Second, the coronavirus genome is attached to something that only lab-made coronaviruses are attached to. By “attached”, I mean literally - the coronavirus genome sequence is contained within a longer sequence. Imagine if the word CORONAVIRUS in our example above were not alone, but instead was part of a longer sequence: WUHAN_CORONAVIRUS_CLONE. When the reads were all paired up in the rice dataset, Jones et al. found the coronavirus genome was contained inside something called a “bacterial artificial chromosome” or BAC. Coronaviruses never exist in BACs in the wild as BACs are creepy lab-made chimeras, bacterial chromosomes containing a bat virus in this case. BACs are used to clone coronaviruses in the lab, allowing researchers to genetically modify their virus, make an infectious clone of that virus, and test properties of that infectious clone, such as how well it can enter human cells or infect humanized mice.
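In the toy example, asking what a sequence is “attached” to amounts to finding it inside a longer assembled sequence and reading off what sits on either side. Here is a minimal Python sketch of that idea, with a hypothetical helper name (find_flanks) and the toy strings from the analogy. A real analysis would typically compare the flanking sequence against known cloning vectors using alignment tools rather than exact string matching.

```python
from typing import Optional, Tuple


def find_flanks(contig: str, insert: str) -> Optional[Tuple[str, str]]:
    """Return the text on either side of `insert` within `contig`.

    Returns None when `insert` does not occur in `contig` at all.
    Only the first occurrence is considered.
    """
    left, found, right = contig.partition(insert)
    if not found:
        return None
    return left, right


# Toy strings from the analogy, not real sequence data.
print(find_flanks("WUHAN_CORONAVIRUS_CLONE", "CORONAVIRUS"))
print(find_flanks("CORONAVIRUS", "CORONAVIRUS"))
```

Empty flanks mean the sequence stands alone; non-empty flanks are the part worth identifying, since in the analogy they are what tells you the sequence came from a clone.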
TLDR: Researchers in Wuhan sequenced some rice sometime in late 2019 or early 2020. Jones et al. discovered the rice dataset contained a bat coronavirus lurking in the background. That coronavirus was contained in a BAC clone, suggesting someone in Wuhan, or using the same sequencing facilities or materials as this group in Wuhan, was doing important experiments with bat coronaviruses in BAC clones.
What that means
Many pieces of evidence prior to this work suggest SARS-CoV-2 most likely came from a lab. This paper finds additional evidence that researchers studying coronaviruses in Wuhan have not shared the full truth about the extent of their coronavirus research activities.
The BAC clone has particular relevance for the probable lab origin of SARS-CoV-2, as it connects with another piece of evidence. One peculiar feature of the SARS-CoV-2 genome is its unusually regular cutting & pasting sites as documented in a paper Valentin Bruttel, Tony VanDongen, and I co-authored. In order to make coronaviruses in the lab from a sequence, researchers have to “glue” together the sequence in blocks, and then they often insert that reconstructed, full-length clone inside a BAC, exactly like the one discovered in the rice dataset. Our findings led to the synthetic origin theory that SARS-CoV-2 might have come to life via the immaculate conception of BAC cloning in a laboratory. It’s curious, then, that Jones et al. found this unpublished bat coronavirus from Wuhan docked inside a BAC clone.
Some researchers have claimed that SARS-CoV-2 could not have been synthesized in a BAC clone because SARS-CoV-2 isn’t anything like any previously published backbone. In the (in)famous “Proximal Origin” paper, Kristian Andersen et al. claimed:
“… if genetic manipulation had been performed, one of the several reverse-genetic systems available for betacoronaviruses would probably have been used. However, the genetic data irrefutably show that SARS-CoV-2 is not derived from any previously used virus backbone.”
In other words, the preeminent zoonotic origin proponents claimed that because SARS-CoV-2 is not like any other published virus, it could not have been derived from this kind of BAC clone. This, of course, rests on the assumption that coronavirus researchers published everything they knew about bat coronaviruses by 2020. In January 2020, the Wuhan Institute of Virology released a sequence, RaTG13, and claimed it was the closest relative to SARS-CoV-2 in their possession. Researchers said SARS-CoV-2 could not have been derived from that sequence, either, so there’s no way SARS-CoV-2 could have been made by BAC cloning.
Again: what if the researchers studying coronaviruses in Wuhan were not truthful? After all, if someone did create a virus that caused a pandemic, it’s likely their first instinct would be to not confess to doing so. It would undermine the claims of the Wuhan coronavirus researchers, and the preeminent zoonotic origin proponents, if one were to find unpublished coronaviruses, especially bat coronaviruses in BAC clones. Such withheld coronaviruses could provide a glimpse into possible gain of function work on coronaviruses taking place in Wuhan. Jones et al. found an unpublished (withheld?) BAC clone of a bat coronavirus in Wuhan, an indisputable unpublished laboratory construct. What does this presumably withheld genome tell us about the coronavirus research in Wuhan?
The closest relatives of this coronavirus had been sequenced by the Wuhan Institute of Virology and EcoHealth Alliance’s Peter Daszak, the exact groups at the center of the dominant scenarios for a laboratory origin of SARS-CoV-2. When we look closely at the coronavirus genome Jones et al. uncovered, there is some evidence this virus may be the same kind of Frankenstein or “chimeric” virus as SARS-CoV-2, as a part of its Spike gene appears more like a MERS coronavirus while the rest of the genome looks like that aforementioned clade of bat coronaviruses. It looks as if researchers studying bat coronaviruses in Wuhan in 2019, incidentally coronaviruses closely related to those published by Daszak + Shi Zhengli, were conducting gain of function research by adding segments of Spike genes from a MERS coronavirus, complete with two potential furin cleavage sites, into an unpublished bat coronavirus backbone whose ancestor likely had no furin cleavage site.
If these findings hold, they suggest that in Wuhan, likely in late 2019, there was at least one unpublished coronavirus fully sequenced by a lab and docked inside of a BAC, waiting to be cloned, exactly as Bruttel, VanDongen, I, and others theorize for the synthetic origin of SARS-CoV-2. This bat coronavirus uncovered by Jones et al. appears to be a recombinant virus, much like SARS-CoV-2. It contains parts of a Spike gene from another coronavirus, much like SARS-CoV-2. Specifically, the recombination involved the exact parts of the Spike gene containing a furin cleavage site, much like SARS-CoV-2. It looks as if the exact research many hypothesize led to the creation of SARS-CoV-2 was conducted on an unpublished bat coronavirus in Wuhan, and, rather than being disclosed, that bat coronavirus had to be uncovered by sleuths as a contaminant in a rice sequencing project.
Why it matters
For their entire evolutionary history spanning over 1,000 years prior to the COVID-19 pandemic, SARS coronaviruses lacked furin cleavage sites. Furin cleavage sites were, however, of great interest prior to COVID-19 as studies of other viruses (e.g. respiratory syncytial virus or MERS-CoV) showed that furin cleavage sites can make it easier for viruses to enter cells, potentially enabling a virus that typically infects bats to be better able to infect human cells, for example. For this reason, and with this knowledge, in 2018, Peter Daszak at EcoHealth Alliance, Shi Zhengli at the Wuhan Institute of Virology, and colleagues wrote the DEFUSE grant proposing to insert furin cleavage sites inside bat coronavirus infectious clones at Wuhan. We can see over 1,000 years of SARS coronavirus evolution in the SARS-CoV evolutionary tree, and we never saw a single SARS-CoV with a furin cleavage site until SARS-CoV-2 in 2020, just 1.5 years after Daszak + Shi et al. proposed to insert one inside a SARS-CoV infectious clone in Wuhan.
It’s additionally weird because the DEFUSE grant was not willingly shared by Daszak nor the Wuhan Institute of Virology; it had to be obtained & released by a third party against the will of the researchers.
Now it’s extra weird, because Jones et al. have just found a bat coronavirus sitting in a BAC clone in Wuhan with a potential furin cleavage site introduced from hypothesized recombination with MERS-CoV. The bat coronavirus is closely related to those uncovered by the PIs of the DEFUSE grant; it was not disclosed by researchers in Wuhan, and it had to be obtained & released by Jones et al. thanks to their forensic investigations of a rice sequencing project. While this bat coronavirus is not technically a SARS coronavirus, it is part of another clade that, like SARS-CoVs, has been extensively studied by Daszak and Shi and is of interest for anyone looking to recombine furin cleavage sites. What a major coincidence that we haven’t seen a furin cleavage site in SARS-CoVs for 1,000 years, and now we find one in SARS-CoV-2 in Wuhan, and we also find in Wuhan an unpublished BAC clone of a coronavirus with a furin cleavage site, a coronavirus whose closest relative was found by the authors of the grant proposing to insert furin cleavage sites in SARS-CoVs in 2018.
Like any good paper, Jones et al. raises a lot of critical questions for further investigation. Are there any other coronavirus BAC clones lurking in the background of our sequence read archives? Did the PIs of DEFUSE have any knowledge of this unpublished backbone, or was there someone else catching bats in their same caves, utilizing their same molecular methods, and examining their same questions by making chimeric coronaviruses in the same city in China? What other coronaviruses sat inside BAC clones in Wuhan in late 2019? Why wasn’t this research released immediately in the interest of international transparency as we tried to understand this novel virus with a furin cleavage site and the world stared down the barrel of an oncoming pandemic?
Why did Jones et al. have to go sleuthing through an old rice dataset to find this coronavirus infectious clone in Wuhan?
Back in 2019, Daszak and the Wuhan Institute of Virology boasted about having hundreds of SARS CoVs, complete with their full genomes on a database in Wuhan, and Daszak alluded to the fact that some were being experimented with. Yet, as of 2023, the most important CoV database in the world, a database so small it could fit on a thumb drive or in an email, has still not been shared. Instead, the database of coronaviruses at the WIV was taken offline in the fall of 2019, so we can’t confirm that the coronavirus BAC clone lurking behind a rice genome is not one of theirs. Jones et al. have uncovered one of possibly hundreds of unreported coronaviruses known to labs in Wuhan, and this one coronavirus looks like a bat coronavirus with one or maybe two furin cleavage sites, a topic of intense interest to Daszak, Shi et al.
Why haven’t Daszak and the WIV shared their database with the world? Is this one of their coronaviruses? What other unreported coronaviruses existed in Wuhan labs in 2019?
If the findings of Jones et al. hold, they suggest there is at least one set of researchers interested in furin cleavage sites in bat coronaviruses in Wuhan and not disclosing the coronaviruses in their possession to the world at a time when the world desperately wants to know how 18 million of our friends & family members died from a bat coronavirus that emerged in Wuhan.
The predictable discovery of suspicious coronavirus genomes like this one is why others and I are calling for transparency from all coronavirus researchers, especially those studying coronaviruses in Wuhan or in collaboration with Wuhan labs. Transparency can help us rule out the involvement of various labs by ensuring they did not have the progenitor. Of course, while transparency clears any lab that did not cause a pandemic, transparency is a guilty verdict for any lab that did cause a pandemic, so we have good reasons to be very skeptical of any lab refusing to share their data. If the world had a full list of coronaviruses studied pre-COVID, the finding of Jones et al. wouldn’t be a surprise. However, instead of having a good sense of all coronaviruses being studied & the research conducted in Wuhan, Jones et al. find this unreported coronavirus lurking suspiciously as a contaminant in a sequence read archive of researchers sequencing rice.
Someone is not telling us the full truth about coronavirus research activities in Wuhan. We are surprised by this novel, unpublished backbone because this information, known to some lab in 2019, is news to us in 2023. It is news to us in 2023 because some lab somewhere in Wuhan didn’t share it with the world in 2020.
We’re wise to use adverse inference. The lab probably didn’t share this information because they created SARS-CoV-2 by inserting a furin cleavage site in a bat coronavirus docked inside a BAC clone, and sharing their data would reveal their guilt. The virus created from their well-intentioned but extremely risky experiment infected some human and proceeded to cause a pandemic resulting in over 18 million deaths. They don’t want to be held accountable for 18 million deaths, so they don’t share their database which includes the backbone to SARS-CoV-2, as well as this unpublished backbone uncovered by Jones et al., and perhaps many more genetically modified viruses that don’t exist in nature and even to this day may pose a threat to global health as great as or greater than the SARS-CoV-2 pandemic. The synthetic origin theory can explain the BAC clone in Wuhan, the lack of evidence for a zoonotic origin, the stack of evidence providing probable cause for a lab origin, the dataset the WIV refuses to share, and more.
Jones et al. found a coronavirus BAC clone needle in a haystack of sequence read archives, a BAC clone that should have been disclosed earlier and voluntarily by researchers in Wuhan at the start of the pandemic. The evidence they’ve uncovered points clearly to unreported, unpublished coronavirus infectious clones in Wuhan at the start of the pandemic, casting serious doubt on the claims that coronavirus researchers in Wuhan were not conducting gain of function research on bat SARS-CoV infectious clones.
On the bright side, Jones et al. are courageous scientists staying true to their scientific roots. Their sharp-penciled examinations show how careful, forensic analyses of sequence read archives can uncover critical evidence of bioscience research activities that may be of interest to the world, such as unreported work on potentially pandemic pathogens in the same city & immediately prior to a pandemic caused by very similar pathogens.
Thank you for the hard work you do Alex. I'm just a lay person trying to understand so I follow along. I admire your due diligence and determination to seek truth.
I don't know the "truth" about the origins. Yet, I read all the "sides" and try to make sense of what I can. I settle on this quote, based upon the behavior of key players regarding this topic, "all the lies people tell just to cover, cover up the truth that reveals itself in the end" Something has always struck me about Alex. He isn't lying. Just a hunch.