Washington: In a shocking revelation, a Seattle-based virologist has claimed that the U.S. National Institutes of Health deleted gene sequences of early Covid-19 cases from a key scientific database at a researcher's request.
The findings have now raised concerns that scientists studying the origin of the pandemic may lack access to key pieces of information.
Jesse Bloom, a virologist at the Fred Hutchinson Cancer Research Center in Seattle, described the removal of the sequencing data in a new paper posted online on bioRxiv on Tuesday.
The paper, which hasn't been peer reviewed, claims that Chinese researchers took virus samples from some of the earliest Covid patients in Wuhan in January and February of 2020, then posted the viral sequences to a widely used US database.
After three months, the genetic information was removed to "obscure their existence", an editorial in the journal Science reported on Wednesday.
"Here I identify a data set containing SARS-CoV-2 sequences from early in the Wuhan epidemic that has been deleted from the NIH's Sequence Read Archive," Bloom posted on bioRxiv.
Meanwhile, the US NIH has confirmed that it deleted the sequences after receiving a request from a Chinese researcher who had submitted them three months earlier, the Wall Street Journal reported on Wednesday.
"Submitting investigators hold the rights to their data and can request withdrawal of the data," the NIH said in a statement.
The scientist "indicated the sequence information had been updated, was being submitted to another database, and wanted the data removed from SRA to avoid version control issues," NIH said.
Bloom said he started his research into the origins of the pandemic after a team led by the World Health Organization submitted its report in March this year. The report was heavily criticised by many scientists for deeming it "extremely unlikely" that SARS-CoV-2 escaped from a laboratory.
Bloom's search led him to a study which listed all SARS-CoV-2 sequences submitted before March 31, 2020, to the Sequence Read Archive (SRA) -- a database overseen by the National Center for Biotechnology Information, a division of NIH. But when he checked SRA for one of the listed projects, he couldn't find its sequences, the Science report said.
Further research led him to another study by Ming Wang from Wuhan University's Renmin Hospital, China, which was published in the journal Small. While the paper lists some of the earliest Wuhan Covid patients and the specific mutations in their viruses, it doesn't give the full sequence data.
Additional internet sleuthing led Bloom to discover that SRA backs up its information in Google's Cloud platform, and a search there turned up files containing some of the earlier data submissions by Wang's team.
The paper in Small makes no mention of any corrections to viral sequences that might explain why they were removed from SRA. This led Bloom to conclude in his preprint that "the trusting structures of science have been abused to obscure sequences relevant to the early spread of SARS-CoV-2 in Wuhan", the report said.