Open letter to the world
A follow-up on our analysis of the anomalous restriction map of SARS-CoV-2
Drs. Valentin Bruttel, Tony VanDongen and I wrote a preprint finding evidence of a synthetic origin of SARS-CoV-2. The online discussion of our paper ranged from uncritical support to critical consideration to every manner of insult and accusation, including ableism, scientists claiming I don’t have a PhD, and more. This letter aims to clarify who I am, what colleagues and I have done in the preprint, and what my intentions are.
Thanks for reading A Biologist's Guide to Life! Subscribe for free to receive new posts and support my work.
Who I am
One of the critiques proliferating on Twitter seems to be that the co-authors and I don’t know what we’re talking about. I aim to address those critiques here.
I grew up with a profound hearing loss in underfunded public schools on the school-to-prison pipeline of Albuquerque. My mom is a brilliant, imaginative molecular biologist, so I had the privilege of learning about biology from an award-winning mentor for my entire life.
In college, I studied mathematics and biology. I did undergraduate research in many fields - isotope ecology with Blair Wolf, protein evolution and biochemistry with Gregory Petsko and Dagmar Ringe, comparative immunology with Sam Loker, and epidemiological modelling with Helen Wearing. I graduated summa cum laude with two degrees and an uncommonly diverse undergraduate research career. I received a prestigious NSF Graduate Research Fellowship to complete my PhD at Princeton University’s new program for Quantitative and Computational Biology under the mentorship of Simon Levin. My thesis was on mathematical and statistical models of competition across ecological, social, and economic systems.
My first postdoc was on microbiome epidemiology & evolution with Diana Nemergut at Duke University. Diana passed away a few months into the postdoc and so I did independent research at Duke developing new tools to analyze microbiome datasets. I also started consulting a quant hedge fund, doing data analysis and trading strategy development more closely related to my PhD work.
My second postdoc, starting March 2017, was on wildlife virology, epidemiology, and pathogen spillover of Henipaviruses from bats to people with Raina Plowright at Montana State University. Together with many collaborators, we obtained a DARPA PREEMPT grant. I was listed as a PI and wrote a section of the grant on phylodynamics, but Montana State University didn’t allow postdocs to be PIs, so I was dropped from that title on a technicality. I was appointed to a research scientist position at MSU and worked on a diverse set of projects including models of pathogen spillover, analyses of which pathogens are likely to spill over from which mammals, and more. During this postdoc, I continued to consult the quant fund and additionally began consulting a group studying microbiomes in glacier-fed streams.
When COVID hit, I began forecasting COVID outbreaks. I discovered an unconventional theory of COVID outbreaks, but due to toxicity and credentialism I didn’t publicly share a lot of my science on this topic. I shared analyses with many managers and was involved with several leading papers on COVID epidemiology on topics from syndromic surveillance, wastewater surveillance, analyses of novel variants of concern, and more. I continued to consult the quant fund, sharing my analyses and assessments with no knowledge of their positions.
I left academia because of the toxicity of COVID science communication, both within the scientific community and at the science-policy and science-public interfaces. I co-founded Selva to improve scientific communication. We had been working with an accelerator (Paralect) since June 2022 to make our MVP, with a launch date set for October 17th, 2022. We decided my effort was best spent doing grassroots user recruitment, so I spent time on Twitter connecting with scientists, helping with analyses, and telling people about our vision for Selva.
As a hobby/curiosity, I revisited the literature on pathogen spillover of SARS-CoV-2 and found it almost suspiciously low-quality, claiming repeatedly on insufficient evidence or inadequate methods to have concluded a natural origin of SARS-CoV-2. I wrote about the statistical challenges of inferring a natural origin with early outbreak data, came across Tony VanDongen and Valentin Bruttel, saw their pictures of the restriction map of SARS-CoV-2, and believed I could help them quantify the likelihood of that strange pattern occurring in nature.
What we did
Some online critiques have accused us of fraud, misconduct, deception, and more. This section aims to address those critiques.
We did exactly what we said in the methods section of our paper. We collected all the genomes for which we could find Spike gene open reading frames using the R package rentrez, and we manually added some genomes that didn’t show up in that search but which we knew to be important (e.g. RaTG13, the BANAL sequences, and a broader range of coronaviruses intended to give us a reasonably balanced phylogenetic sampling of all coronaviruses).
The only historical point I will add to our methods section is some additional detail on how this pattern was found. As mentioned, I entered this research after Tony and Valentin shared a tweet showing the unusual, regular spacing of BsaI + BsmBI restriction sites. As the statistician, I asked Tony and Valentin how they discovered this pattern, as the discovery process determines which statistical statements we can and cannot make. They said they loaded the SARS-CoV-2 genome into SnapGene, focused on Golden-Gate enzymes, and the pattern struck them immediately. While someone can look into a river and see a nugget of gold, and it can indisputably be a real discovery of a real nugget of gold, we all agreed that eyeballs can pick up patterns and consequently we cannot properly call our analysis of that specific pattern “hypothesis testing”. The analysis of the BsaI/BsmBI spacing is best considered a post-hoc analysis, and so we deliberately refrained from hypothesis-testing language in favor of quantifying the likelihood of this pattern in nature (outlier analysis).
We chose our statistic based on principles. Since all fragments sum to the length of the genome, if the longest fragment, expressed as a fraction of the genome length, is equal to the inverse of the number of fragments, it implies the sites are perfectly evenly spaced. Conversely, if the longest fragment is nearly the entire length of the genome, it implies the restriction sites are concentrated in a tiny region of the genome. When we put fragments into plasmids, they may be unstable; the probability a fragment is unstable increases with the length of the fragment, so the longest fragment provides a meaningful upper bound on the probability a random fragment in the genome will lead to unstable plasmids that prohibit synthesis. We believe the longest fragment length is a beautiful statistic for this problem as it contains information on both the even-spaced-ness of restriction sites and the probability a given construct is synthesizable. As such, it was the only statistic we considered when quantifying how much of an outlier the SARS-CoV-2 restriction map was compared to the wild type distribution.
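For readers who want to see the arithmetic, here is a minimal sketch of the longest-fragment idea in Python. It is an illustration only, not the code from our preprint: the function name, the genome length, and the cut positions are all made up for the example, and it treats the genome as linear and ignores where exactly each enzyme cuts relative to its recognition site.

```python
def longest_fragment_fraction(site_positions, genome_length):
    """Return (longest fragment / genome length, number of fragments).

    Assumes a linear genome and treats each site position as a cut point.
    All names and inputs here are hypothetical, for illustration only.
    """
    # Keep only distinct cut points strictly inside the genome.
    cuts = sorted({p for p in site_positions if 0 < p < genome_length})
    # Fragment boundaries run from the start, through each cut, to the end.
    boundaries = [0] + cuts + [genome_length]
    fragments = [right - left for left, right in zip(boundaries, boundaries[1:])]
    return max(fragments) / genome_length, len(fragments)


# Hypothetical 30,000 bp genome with four evenly spaced cut sites:
# five equal fragments, so the longest is 1/5 of the genome.
even_fraction, even_count = longest_fragment_fraction(
    [6000, 12000, 18000, 24000], 30000
)

# The same number of sites clustered at one end: the longest fragment
# is nearly the whole genome.
clustered_fraction, clustered_count = longest_fragment_fraction(
    [100, 200, 300, 400], 30000
)

print(even_fraction, even_count)
print(clustered_fraction, clustered_count)
```

With the same number of sites, the evenly spaced map gives the smallest possible value (one over the number of fragments) and the clustered map gives a value close to one, which is the sense in which this single number captures even-spaced-ness.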
Personally, I think the dichotomy between hypothesis-testing and likelihood-quantification is a false one. The “P=0.05” cutoff we use to “reject” a hypothesis is an arbitrary one. When I read papers, I never “accept” or “reject” hypotheses but rather consider likelihood quantification as a measure of the weight of evidence or a distance of the data from some null hypothesis, as measured by some statistic. I encourage everyone else to consider this probabilistic worldview when viewing our paper: we aimed to quantify probabilities of this system occurring in nature, and P-values were convenient and commonly understood ways of communicating quantiles.
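To make "communicating quantiles" concrete, here is a small sketch of how an observed statistic can be placed within a reference distribution. This is not our analysis code: the function name and every number below are hypothetical, and the plus-one adjustment is one common convention for empirical quantiles that I am assuming for the example, not necessarily what a given paper uses.

```python
def empirical_quantile(observed, reference_values):
    """Fraction of reference values at or below the observed value.

    Uses the (k + 1) / (n + 1) convention so the estimate is never
    exactly zero. Names and inputs are hypothetical illustrations.
    """
    n = len(reference_values)
    at_or_below = sum(1 for value in reference_values if value <= observed)
    return (at_or_below + 1) / (n + 1)


# Made-up longest-fragment fractions for ten reference genomes.
reference = [0.45, 0.38, 0.52, 0.30, 0.61, 0.27, 0.49, 0.33, 0.41, 0.56]

# An observed value below every reference value sits in the lowest
# possible quantile for a sample of this size: 1/11.
print(empirical_quantile(0.26, reference))

# An observed value in the middle of the reference values: 5/11.
print(empirical_quantile(0.40, reference))
```

The output is a position on a continuous scale, not a verdict. How far into the tail a value has to be before it changes your mind depends on your priors.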
The subsequent analyses in our paper had not been examined with any eyeballs, and those can be viewed through a hypothesis-testing lens if you wish. The suitability of the sticky ends for assembly, the rate of BsaI + BsmBI mutations which are silent, the concentration of silent mutations within BsaI + BsmBI sites, and the probability of seeing such an idealized reverse genetic system under our model of evolution were all evaluated with a clear a priori hypothesis and one “test” (quantile estimate) each. While this does fall under hypothesis-testing, I still view the resulting probabilities as quantifications of the weight of evidence and distances from a hypothesis rather than a binary acceptance/rejection of a hypothesis. Our language in our paper aimed to reflect that: we quantify the weight of evidence, and the world will interpret this with their own priors. We use P-values at times because those are a commonly understood way to communicate quantiles and distances from null hypotheses.
Before releasing the preprint, we shared our findings with many colleagues and sought their critical feedback. Over a dozen of the smartest people I’ve ever met looked at our results, shook them down, and arrived at our conclusion that the evidence is compelling and must be shared. None of them envied our position of having to be the ones to share these findings and, given the inflammatory discourse on this topic, we refrain from naming those colleagues in our paper. As everyone can see from the Twitter discourse, scientists can be extraordinarily hostile to novel theories, especially when those theories come from unfamiliar scientists. In addition to baseless accusations of fraud, misconduct, and deception, scientists with blue check-marks (not anonymous trolls) harassed Dr. Francois Balloux and others who shared their opinions. Many brilliant scientists lurk on Twitter under anonymous accounts, fearful of being doxed, called out, and possibly fired. The online hostility from members of the scientific community creates reputational risks for scientists who deviate from particular theories of COVID epidemiology and origins. I suffered from those reputational risks throughout COVID; they are why I left academia and why I co-founded Selva to improve science communication.
At Selva, we had planned to launch October 17th way back in the first week of June. The paper was completed on an independent timeline - I first reached out to Tony on September 5th. The analyses were quick, simple, easy, and finished in two weeks. We sought feedback from colleagues, polished the draft, and the draft was nearly complete by October 15th. The other co-founders of Selva and I discussed this unfortunate timing and ultimately decided to postpone our launch because (1) sharing this paper is part of my civic duty as a scientist to say something if I see something, (2) we realized adequate public engagement on this paper would suck up all my time, and (3) we realized critics might claim the paper was just a publicity stunt, undermining both the rigor of the paper and the sincerity of Selva. Selva’s launch is thus postponed.
Selva does not pay salary. My primary source of income at the moment is capital gains from life insurance stocks I bought following the announcement of Omicron. The conventional theory of COVID was highly uncertain about mortality from Omicron whereas my theory of COVID was very certain about upper bounds of mortality from Omicron. I examined the fundamentals of life insurance companies and found one with robust revenue growth, consistent dividends, high stock price sensitivity to COVID news, and clear connections between earnings-per-share and COVID mortality. With that information, I used stock prices and P/E ratios to estimate market beliefs on upcoming COVID mortality, confirmed a major discrepancy between market beliefs and my own, and put every penny I own into a margin trade on this one life insurance stock that has performed very well.
Every night, I thank ‘whatever Gods may be’ for all that I’ve been given and pray that I may do the greatest good for the world with this one life. I love science. Science can save lives, unleash new forms of energy for a more sustainable society, transform how we access and process information, and save the world. I am a scientist because I love learning and because it’s a way for me to do good for the world.
Science is a study of cause and effect granting us a mastery over causes to achieve our desired effects. We don’t know what caused the emergence of SARS-CoV-2. An honest knowledge of what caused this pandemic can help us better achieve the desired effect of preventing future pandemics. The literature on zoonotic origin is inconclusive and in my expert opinion many virologists have been suspiciously eager to prove the conclusion with inadequate evidence. Our ignorance on the origins of SARS-CoV-2 should disturb us all. Millions of people have died of an unknown proximal cause, and that cause could strike again, killing millions more, if we don’t learn the honest truth.
There are very good reasons to suspect SARS-CoV-2 may have arisen as an accident of laboratory research, and the labs in question conducting gain of function research on coronaviruses have not been transparent with the world about the coronaviruses they’ve collected and the research they’ve conducted. If we had trust and transparency, we could have a much better understanding of whether or not a lab accident caused the pandemic. Yet, the data and lab notebooks that could reject a lab origin have not been shared by the labs that would be exonerated by this evidence were there truly a natural origin.
The pandemic has led to a surge in labs that handle dangerous pathogens. Researchers even here in the United States conduct gain of function research that risks creating a more transmissible, deadlier, or immunoevasive pathogen. Such labs and such risky virological research are precisely what may have caused millions of people to die in the COVID-19 pandemic. The same research that may have caused a pandemic is currently conducted around the world without, in my opinion, adequate regulations, oversight, or transparency.
The lack of transparency we’re seeing from labs at the heart of lab-origin hypotheses makes me deeply afraid of the surge of similar virological research absent global coordination on biosafety. If another unusual virus with anomalous sequences inserted exactly as proposed by researchers emerges far from hotspots of wildlife viral diversity and adjacent to a hotspot of the proposed virological research, will we be kept in the dark about laboratory records then, too? Those who don’t know history are doomed to repeat it. If we don’t know, from honest and critical scientific examination, the true history and cause of SARS-CoV-2 that wrought havoc on our world, we are doomed to repeat it.
My desperate need for Truth is why I critically examined this issue and why I published this preprint. If I had found the literature on zoonotic origin convincing, I would have focused my energy elsewhere. If we had transparency from labs in question and could trust the governments overseeing those labs, I would have looked at the lab records, found few dark spaces in which doubt can linger, and moved on. If we examined the SARS-CoV-2 genome and found evidence against our hypothesis, or if any of the colleagues we consulted found critical flaws, the preprint would never have been published.
Instead, the literature claiming natural origin was almost suspiciously flawed. The labs in question have been silent and uncooperative. Datasets have been deleted and uncovered only by clever sleuthing of internet archives. Researchers at the heart of this issue, and the funders who supported them, have called lab origin hypotheses “conspiracy theories” without disclosing their clear conflicts of interest on this question. We were told they didn’t do gain of function research on coronaviruses only to later uncover grants proposing to insert Furin cleavage sites. We found strong evidence of synthetic origin, we were unable to reject this theory on our own, and we shared it with the world to further our collective understanding of this essential issue. Many of the same researchers, funders, close colleagues of researchers and funders, and the virology community facing risks of red tape are outraged; they are calling our preprint “confected nonsense” and “b*******”, and they are saying I don’t have a PhD, that I’m not an expert on this topic, that our work was fraudulent, that we are engaged in misconduct and deception.
Yet, hovering above all the rancor, there is Truth. There is some Truth about the origin of SARS-CoV-2 and we don’t know it. The Truth may or may not be the synthetic theory we presented. We can and should sample bats to search for progenitors. We can and should analyze genomes to understand the origin of unusual features of this unusual virus. We can and must demand transparency from labs involved in coronavirus research and the funders who funded them. We can and must stop at nothing to find out what caused millions of people to die so we can save millions of lives in the future. We can and must critically examine this issue and, if we see something in our analyses, we should say something.
Imagine if all the human capital wasted arguing on Twitter was instead spent on a dogged collaborative pursuit of this Truth. Imagine if virologists were as vocal in advocating for transparency and risk-management of risky virological research, research capable of killing millions of people, as they have been about a guy in Montana publishing a preprint with a compelling new finding. Which is riskier: gain of function research or compelling preprints? Imagine if the mere possibility of a lab origin was enough to compel the whole world to demand transparency - imagine if the time and resources and media attention spent catching bats and promoting inconclusive papers was also spent demanding information from the scientists among us who conducted research hypothesized to have caused a pandemic.
We are charging blindly into the future, erecting hundreds of new labs to handle dangerous pathogens and conduct research without global coordination on biosafety. We lack any assurances that pathogens studied or assembled in vitro will be traceable to labs and that laws will be passed prohibiting research on untraceable deadly pathogens. We have no assurances that labs near the epicenter of the next pandemic will be forced to share their records or confirm that all such records are required to be available online. We currently lack systems that can hold researchers in secretive labs run by secretive governments accountable for taking risks in research. A small number of virologists could very well have opened Pandora’s Box causing millions of people around the world to die, and our systems of media, government, and science have been unable to discover the Truth in a timely, transparent, and trusted fashion.
My intention with this paper was to find the Truth. I believe the anomalous endonuclease fingerprint of SARS-CoV-2 is evidence of a synthetic origin by methods used in Wuhan and in a handful of labs with connections to the Wuhan Institute of Virology. Our finding is also a hint at policy on global biosafety: we would be wise to require all research of dangerous pathogens use traceable constructs, with exact signatures published online. Much like we require serial numbers on guns and prohibit guns from having silencers, we should consider a global registry of dangerous viruses and prohibit unannounced, untraceable modifications.
My co-authors and I are currently handling a higher volume of feedback than we are able to deal with on any given day. If you find any flaws with our manuscript or have any constructive feedback, please feel free to email me at the email listed on the preprint. If I become aware of evidence that makes me disbelieve the conclusions of our manuscript, I will announce my updated beliefs and, with the permission of whoever discovers it, I will celebrate whoever provided that critical evidence that changed our minds. I invite everyone in the world to participate in this discussion. The science is terrifying yet fascinating. The topic is personally relevant to every person capable of being infected by a virus or impacted by pandemic policies. I invite people to prove us wrong and, if they do so, even if there are flaws in their work, I will not call them names or attack their credentials. I will celebrate their ingenuity and commitment to the Truth, and if I am proven wrong I will change my mind.
Science can save lives and revolutionize our civilization, but only if scientists and our broader society remain honest, curious, and open-minded. My final intention worth sharing is to cultivate a better culture with improved discourse. Science is a social system, and like any other social system it is vulnerable to causing harm by the vitriols, tempers, and ambitions of the humans involved. I passionately invite everyone else to join me. Let’s improve discourse. As scientists, let’s lead by example. I invite everyone to refrain from ad hominem attacks, leave the culture better than you found it, be open to all evidence, consider all theories and possible explanations, and celebrate clever independent thought that is the wellspring of innovation. Stay curious. Stay open to new evidence. Don’t be afraid to admit if you are wrong as doing so brings us all closer to the Truth.
From climate change and mass extinction to disinformation and dangerous biotechnology, our civilization may be far more fragile than we realize, like a sturdy ceramic bowl that is perfectly intact until an accident that shatters it beyond repair. We are barreling towards some of the largest challenges our species has ever faced. Our actions today and the culture we create may determine what world, if any, our great grandchildren inherit. One of the greatest goods we can do with the time we have is learn the true origins of SARS-CoV-2, equipping us to prevent future pandemics from killing millions of people and disrupting global political and economic systems.
Alex Washburne, PhD