Why did Wuhan University researchers delete COVID-19 data at NIH?

It was NOT a cover-up and the data was NOT very valuable either, says Chinese official

Your Pekingnologist attended the press conference on COVID origin tracing Thursday morning (GMT+8) at the State Council Information Office in Beijing and live-tweeted it in a thread, if you are interested.

In this brief newsletter, your Pekingnologist wants to highlight one thing, which press reports of the press conference so far - from Reuters, the Washington Post, Al Jazeera, AP, The New York Times, and Bloomberg - haven’t covered yet.

Remember this story across international media in late June - a month ago?

The New York Times: Scientist Finds Early Virus Sequences That Had Been Mysteriously Deleted

Wall Street Journal: Chinese Covid-19 Gene Data That Could Have Aided Pandemic Research Removed From NIH Database

Financial Times: US says Chinese scientists asked for removal of virus records from database

Bloomberg: U.S. Confirms Removal of Wuhan Virus Sequences From Database

Nature: Deleted coronavirus genome sequences trigger scientific intrigue

Now if all of these media outlets run a story on the same thing around the same time, there is no need to explain to you why it had been newsworthy, or Pekingnology-worthy.

What’s the story? To summarize, in the words of the Financial Times report:

Records of early Covid-19 cases in Wuhan were deleted from a US database at the request of Chinese scientists, American officials have confirmed.

A team of academics from Wuhan, where the first documented cases of Covid-19 appeared, submitted sequences of the virus that causes the disease to a US-based archive in March 2020.

Three months later, however, they asked for those sequences to be removed and the data were deleted, the US National Institutes of Health said on Wednesday, confirming the results of an investigation by biologist Jesse Bloom.

“Submitting investigators hold the rights to their data and can request withdrawal of the data,” NIH said in a statement.

The deleted information did not prove how Covid-19 first infected humans, whether via animals or a laboratory leak from the Wuhan Institute of Virology. But experts said the incident demonstrated further evidence of how Chinese researchers and officials have not been fully transparent in how they dealt with data related to the pandemic’s origins.

Basically, Jesse Bloom, a scientist in the U.S., found that some SARS-CoV-2 sequencing data uploaded by Chinese scientists had later been withdrawn, recovered the data from Google Cloud, and then published a preprint and a Twitter thread, leading to the news reports.

The tone of the press reports was that the withdrawal of the data was suspicious at the very least, though some scientists were quoted casting doubt on the significance of the deleted data. From the Wall Street Journal report:

Stephen Goldstein, a University of Utah evolutionary virologist who wasn’t involved in Dr. Bloom’s research, said it was unclear if any new insights could be gleaned from the deleted sequences. “From a scientific standpoint, I don’t think they point to anything nefarious,” he said, adding that he had not made his own analysis of the sequences.

Here is some criticism from Dr. Angela Rasmussen on June 23rd

What’s the Chinese side of the story?

In today’s press conference, a question was raised on this particular incident, and Zeng Yixin, vice minister of China's National Health Commission, gave a quite detailed answer:


After this incident was reported, we immediately conducted an investigation and gained an understanding of it. Here's how it happened. The problem of sequences being deleted mentioned in the press reports originated from a paper by some researchers from Wuhan University, in the international journal Small. The title of the paper was "Nanopore Targeted Sequencing for the Accurate and Comprehensive Detection of SARS-CoV-2 and Other Respiratory Viruses."

From the title, we can see that this paper reported (centered on) a sequencing method. In March, when they submitted their paper, they needed (to show) sequencing results, that is, (after) you established such a method and conducted sequencing, how was your sequencing result? Sequencing results are needed to determine the accuracy of sequencing and whether their method is reliable.

So the researchers uploaded the sequencing results of specific COVID samples to the NCBI (National Center for Biotechnology Information) database, which is managed by the NIH, the National Institutes of Health (of the United States).


On June 9, the journal Small sent the researchers the draft of the paper ahead of publication. At this time, the researchers found that the original content (written by these Chinese researchers) in the paper giving the address where the SARS-CoV-2 sequencing data of the case samples had been uploaded had been deleted during the review process (of their paper).

Therefore, the researchers thought it was no longer necessary to store the data in the NCBI database. They sent an email to the NIH on June 16 last year (2020) requesting that the data be withdrawn. The NIH deleted the data according to its protocols - if you ask for deletion, the data gets deleted - and sent no further notice to the (Chinese) researchers. The researchers then forgot about it. So, from this process, (you can see that) the researchers had no need to hide or cover up (anything) - they didn’t have this subjective intention.

Recently, the researchers have uploaded 244 pieces of sequencing-related data from all 61 COVID-19 samples to the GSA database built by the China National Center for Bioinformation. The database is open and can be viewed and queried by researchers around the world.


According to our understanding, the earliest sampling time of this batch of samples was January 30 - some time had passed since the COVID outbreak began. In fact, these are not early samples. These sequences provide limited information and value for COVID-19 origin tracing.

However, a U.S. researcher, Jesse Bloom of the Fred Hutchinson Cancer Research Center, without confirmation from the Chinese researchers and without any knowledge of the background to the matter, made up a so-called conspiracy theory, saying this was an attempt at a cover-up.

This conspiracy theory of his has had a very bad influence on international public opinion, smearing the Chinese researchers and hurting them. His approach departs from science and violates scientific ethics.

Later, after the paper (Bloom’s preprint) came out, it was also criticized by experts from many countries, who said what he did was unscientific and violated scientific ethics.

During the pandemic, the public pays close attention to professionals, especially scientists, and everything scientists say and do is highly sensitive. Therefore, every expert and scholar should understand the social responsibility we shoulder. Particularly during the pandemic, ordinary people pay very close attention to, and are highly sensitive to, statements related to the pandemic. We must understand the social responsibility we bear, and do our best to contribute as professionals to pandemic prevention and control across society. We should guide public opinion correctly and not speculate at will, which causes bad effects and leads society's pandemic prevention and control astray. Therefore, I think every expert should be reminded to learn a lesson from this matter...

To sum up, even though a response to this drama could have come sooner, the reason behind the withdrawal of the data, as explained by Vice Minister Zeng, was a purely technical one, not nearly as sensational as it had been made out to be.

With contribution from Yang Liu, host of the Beijing Channel newsletter.