What makes a virus more dangerous? We’ve talked about the novel coronavirus mutating and about tracking its spread based on genetic relatedness in “family trees”, but how do we know that these little changes aren’t making the virus more contagious?
In the words of Ed Yong in his comprehensive piece on novel coronavirus mutations, “viruses change all the time; strains arise when they change in meaningful ways.” New strains aren’t just new lineages; they actually function differently, for example by becoming more or less contagious. The reason we use the name SARS-CoV-2 for all existing versions of the virus that causes COVID-19 is that, functionally, we think we are still at the stage where there is only one strain. Several recent pre-print* papers, however, have shown us what to look for to see if that is changing.
In April, a group at Los Alamos National Lab in the US hypothesized that this is exactly what happened in Italy and subsequently globally. They found that a particular genetic strain of the virus (identified by specific mutations we can sequence, in this case in the important “spike” protein) had risen to high frequency in Italy at the same time that the outbreak got out of control. This version of the virus is being called D614G, meaning that the amino acid, or protein building block, at position 614 in the spike protein has changed from a D (aspartic acid) to a G (glycine) as the viral genome has mutated. Since the spike protein is how the virus interacts with and enters human cells, scientists wonder if changes in the spike protein could make the virus more or less likely to spread. In the case of this study, the authors hypothesized that the D614G strain had become dominant in Italy and elsewhere because it might be more contagious.
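To make the naming convention concrete, here is a minimal Python sketch that splits a label such as D614G into its three parts. The function name `parse_substitution` and the small lookup table are illustrative assumptions, not part of any tool used in the study.

```python
import re

# One-letter amino acid codes; only the two needed for this example are listed.
AMINO_ACID_NAMES = {"D": "aspartic acid", "G": "glycine"}


def parse_substitution(label):
    """Split a label such as 'D614G' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


original, position, replacement = parse_substitution("D614G")
print(
    f"Position {position}: {AMINO_ACID_NAMES[original]} "
    f"replaced by {AMINO_ACID_NAMES[replacement]}"
)
```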
Other scientists quickly raised the issue that this could be due to a “founder effect”, meaning that a single strain became more common simply because it was the one that made it into the country by chance, not because it was the strongest. The success of such founder strains is very different from the success of strains that are competitive evolutionary winners. The authors do include some early analyses to show that D614G rises to dominance even when other strains arrived before it, but one experiment that could test this claim directly would be to show that, given equal starting odds against other strains, this one is more successful. An excellent commentary from Dr. Bill Hanage at the T. H. Chan School of Public Health at Harvard also discusses how it’s hard to tell how this “winner” strain behaved in other countries from which we have fewer sequenced viral genomes. As the genomic data becomes more complete, we’ll be able to better tell the story of the virus, as evolutionary scientist Dr. Katherine Xue begins to do in her New Yorker article on this topic.
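The difference between a founder effect and a true competitive advantage can be sketched with a deliberately simple two-variant model. Everything here is an assumption made for illustration: the model itself, the starting frequencies, and the 20% advantage are hypothetical and are not taken from the Los Alamos study.

```python
def next_frequency(freq, relative_fitness):
    """Advance one generation: the variant's share is reweighted by its fitness."""
    weighted = freq * relative_fitness
    return weighted / (weighted + (1.0 - freq))


def simulate(start_freq, relative_fitness, generations):
    """Return the variant's frequency after the given number of generations."""
    freq = start_freq
    for _ in range(generations):
        freq = next_frequency(freq, relative_fitness)
    return freq


# Founder effect: the variant is common only because it arrived first.
# With no advantage (relative fitness 1.0) it stays where it started.
founder = simulate(start_freq=0.9, relative_fitness=1.0, generations=30)

# Equal starting odds, no advantage: the 50/50 split does not move.
neutral = simulate(start_freq=0.5, relative_fitness=1.0, generations=30)

# Equal starting odds, hypothetical 20% advantage: the variant takes over.
advantaged = simulate(start_freq=0.5, relative_fitness=1.2, generations=30)

print(f"founder, no advantage:   {founder:.2f}")
print(f"equal start, no advantage: {neutral:.2f}")
print(f"equal start, advantage:    {advantaged:.2f}")
```

In this toy model, a high frequency on its own says nothing about fitness. Only the head-to-head comparison from an equal start separates the two explanations, which is why that experiment matters.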
The other primary concern scientific peers raised about this Los Alamos study was that the research didn’t show anything specific about the function of the D614G spike protein mutation. The study does share some preliminary evidence from experiments in the lab that there might be higher amounts of virus in patients with the dominant strain, but as it stands this is just a piece of the puzzle. To show that a new mutation changes the transmissibility, we need to test it directly. This could work by evaluating the viral function in mammalian cells or animal models or by making systematic changes to see what happens. On this, we are making quick progress. Two other recent studies exemplify these approaches and showcase how genomics is being used to answer questions about the function of mutations.
To directly test the function of the dominant D614G mutation identified in the Los Alamos study, a group at Scripps Research in Florida recently used a harmless virus proxy system to see if the mutation made the virus more infectious. This system uses safe viral particles commonly used in labs that are engineered to express the SARS-CoV-2 spike protein so they can test things like protein stability or cellular invasion without any risk of creating a more dangerous version of SARS-CoV-2 itself. They found that D614G stabilized the spike protein and that viral particles carrying it had five times as many spike proteins that they could use to infect mammalian cells. These particles also infected mammalian cells in a lab environment more efficiently than particles without the D614G mutation. Although this is potentially significant work, it is not without its constraints, which are discussed in this New York Times article highlighting the project, including whether this type of mutation would affect transmission rate in any detectable way or change patient outcomes.
A group from the University of Washington in Seattle recently published the results of a functional screen called a “deep mutational scan.” This approach lets scientists systematically change every building block (amino acid) of a protein to every other possible amino acid and see how the protein is affected. In this screen, they focused on around 200 amino acids that make up the specific part of the SARS-CoV-2 spike protein that binds to the receptor on human cells and lets the virus enter. The tools they created are all freely available. Having a functional map like this is important because if there are changes that make the virus bind less well, it’s unlikely we’d see those changes become dominant; those positions might be unlikely to mutate and therefore be good targets for vaccines or therapeutics to have a lasting effect. The other reason a map like this is useful is that if a new mutation appears in the virus, we can predict how it will affect binding – especially important if it makes the virus bind more strongly, as it did for 46% of the variants in this study. But this is also just another piece of the puzzle; better binding alone does not mean the virus will be more dangerous, which is why we need complementary research in all dimensions of this.
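To give a sense of the scale of such a scan, here is a minimal Python sketch that lists every single amino acid change for a short stretch of protein. The three-letter fragment and its starting position are hypothetical placeholders, not the real spike sequence, and the function name `single_substitutions` is an assumption made for this example.

```python
# The 20 standard amino acids, by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence, first_position=1):
    """Yield a label like 'D614G' for every possible single amino acid change."""
    for offset, original in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                yield f"{original}{first_position + offset}{replacement}"


# Hypothetical three-residue fragment, numbered from 614 for illustration only.
toy_fragment = "DGV"
variants = list(single_substitutions(toy_fragment, first_position=614))

print(len(variants))   # 3 positions x 19 alternatives = 57
print(variants[:3])    # ['D614A', 'D614C', 'D614E']
```

Each position can change to 19 other amino acids, so a region of around 200 amino acids gives roughly 200 × 19 = 3,800 single changes to measure.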
These pre-prints collectively demonstrate both the power of putting results out early so that others can build on them and ask the important questions, and the challenges of work entering the public domain before it has undergone peer review, which might make it more ready for the spotlight. When the pre-print from Los Alamos was first released in April, it became something of a cautionary tale about what happens when very preliminary results are over-interpreted, both by the authors and especially in the media. Although the focus of the study itself was mostly about the creation of an important data analysis pipeline, the idea that a more contagious strain was circulating rose to the forefront – hyped by an LA Times article that ran away with this angle. Now that the study has undergone peer review, it has been published in Cell, with an accompanying commentary piece that discusses its limitations. Its early release in pre-print, however, has enabled other groups in the months since to start to test its hypothesis and build on its important core premise: that we need to be using genomics to study the effects of the mutations in the novel coronavirus right away.
* Pre-print status means that the research has been completed and the paper has been written, but it has not yet undergone the peer review that would raise key concerns and often ask for additional supporting experiments.