McDonnell Boehnen Hulbert & Berghoff LLP

The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the most severe since the 1918 influenza pandemic (colloquially known as the Spanish Flu; see J. Barry, The Great Influenza: The Story of the Deadliest Pandemic in History, Penguin Books; Revised edition (October 4, 2005)).  The geographic origin of the virus in Wuhan, China is well established, but its biological origin is less well understood (although bats are the most likely culprit).

Coronaviruses have arisen in bats, pigs, and cattle, and one species, HCoV-OC43 from cattle or swine, was responsible for a human pandemic in the late 19th Century.  These viruses appear to be promiscuous, being transmitted from bats, cattle, or swine to humans and from humans to tigers and pigs.  One possible reason is that coronaviruses including SARS-CoV-2 infect human cells through binding of the viral Spike protein to angiotensin I converting enzyme 2 (ACE2), a protein highly conserved in mammals because of its endogenous functions in regulating vasodilation and vasoconstriction as part of the renin–angiotensin system.  Adaptation of the virus to different hosts, accompanied by changes in Spike protein structure, has been seen in the masked palm civet, believed to be an intermediate host between bats and humans; in this instance, the virus mutated at two Spike protein sites to amino acids conferring higher-affinity binding to human ACE2.  However, the role of the civet has not been definitively established, with the Malayan pangolin being another possible intermediate.  Comparisons between the SARS-CoV-2 Spike protein binding sequence and ACE2 in vertebrate species show conservation that may be related to susceptibility, suggest animal reservoirs and intermediates between bats (the native species) and humans, and illustrate the likelihood that SARS-CoV-2 may infect various species, including endangered species.

Recently, an international team of researchers* discussed these features of host-virus interaction in a paper entitled "Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates" in the Proceedings of the National Academy of Sciences.  These researchers compared the amino acid sequence believed to be involved in Spike:ACE2 binding in ACE2 proteins from 410 vertebrate species (252 mammals, 72 birds, 65 fishes, 17 reptiles, and 4 amphibians), assessing the relationships and conservation among these species within the 25-amino-acid region of ACE2:


where amino acid residues shown in bold are binding "hotspots" constrained with regard to binding to SARS-CoV-2 Spike protein sequences; amino acids intervening between some of these residues are not shown.

The authors constructed five categories of binding affinity (very low, low, medium, high, and very high) based on the human ACE2 sequence and report that catarrhine primates ("Old World" monkeys, apes, and humans) were the only group in the very high category; the high and medium categories contained other mammals (and only mammals).  Very high binders show essentially no variants in amino acid sequence in the population at large (although there can be low levels of variability).  The high binding category showed from one to five variations in the sequence; these variants were predominantly conservative substitutions and were found in "12 cetaceans (whales and dolphins), 7 rodents, 3 cervids (deer), 3 lemuriform primates, 2 representatives of the order Pilosa (giant anteater and southern tamandua), and 1 Old-World primate (Angola colobus)."  While the medium category showed between two and five variants, overlapping in number with the high category, these variants arose more frequently at hotspot positions or were nonconservative substitutions; domestic cat, Siberian tiger, and other felids were found in this category, as well as several "New World" primates, cattle, bison, sheep, goat, water buffalo, Masai giraffe, and Tibetan antelope, a significant finding because domestication raises the possibility of zoonotic transfer to humans.  Low binding variants had a higher proportion of nonconservative substitutions (and somewhat paradoxically included bats), while the very low category showed up to 14 mismatches in the sequence, higher frequencies of nonconservative substitutions, and more of these substitutions arising at hotspot residues; this category contained "[a]ll monotremes (n = 1) and marsupials (n = 4), birds (n = 72), fish (n = 65), amphibians (n = 4), and reptiles (n = 17)" including "Chinese pangolin, Sunda pangolin, and white-bellied pangolin."  A frequent mutation in all categories was Met -> Thr (with lower frequencies of Met -> Ser, Met -> Asn, and Met -> Ala).
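
The kind of comparison described above (counting differences from a reference sequence, and weighing whether each difference is conservative and whether it falls at a hotspot) can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the authors' method: the 25-residue reference string, the hotspot positions, the physicochemical groupings, and the category thresholds below are all hypothetical placeholders and are not taken from the paper.

```python
# Toy 25-residue reference string. This is NOT the human ACE2 binding
# sequence from the paper; it is a placeholder built from one-letter codes.
REFERENCE = "ACDEFGHIKLMNPQRSTVWYACDEF"

# Hypothetical zero-based "hotspot" positions within the toy reference.
HOTSPOTS = {4, 8, 12, 16, 20}

# Simplified physicochemical groups (an assumption): a substitution within a
# group is treated as conservative, anything else as nonconservative.
GROUPS = ["AVLIM", "FWY", "STNQ", "KRH", "DE", "C", "G", "P"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def compare(seq, ref=REFERENCE):
    """Return (mismatches, nonconservative mismatches, hotspot mismatches)."""
    if len(seq) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    unknown = set(seq + ref) - set(GROUP_OF)
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    mismatches = [i for i, (a, b) in enumerate(zip(seq, ref)) if a != b]
    nonconservative = sum(GROUP_OF[seq[i]] != GROUP_OF[ref[i]] for i in mismatches)
    at_hotspots = sum(i in HOTSPOTS for i in mismatches)
    return len(mismatches), nonconservative, at_hotspots


def categorize(seq, ref=REFERENCE):
    """Assign one of five categories using illustrative, made-up thresholds."""
    total, nonconservative, at_hotspots = compare(seq, ref)
    if total == 0:
        return "very high"
    if total <= 5 and at_hotspots == 0 and nonconservative <= 1:
        return "high"
    if total <= 5:
        return "medium"
    if total <= 8:
        return "low"
    return "very low"


def mutate(ref, changes):
    """Return a copy of ref with {position: residue} substitutions applied."""
    residues = list(ref)
    for position, residue in changes.items():
        residues[position] = residue
    return "".join(residues)


print(categorize(REFERENCE))                    # identical to the reference
print(categorize(mutate(REFERENCE, {0: "V"})))  # one conservative change
print(categorize(mutate(REFERENCE, {4: "D"})))  # nonconservative, at a hotspot
```

The sketch shows why the high and medium categories can overlap in the number of variants: two sequences with the same mismatch count land in different categories once the position and chemical nature of each substitution are taken into account.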

These authors further reported the results of an analysis of human variants in the ACE2 amino acid sequence believed to bind to the SARS-CoV-2 Spike protein; variants that showed less effective binding were, unfortunately, very rare (<0.001), arising in ten of the twenty-five amino acid residues.  The researchers further reported that, when substitutions were analyzed for their expected effects on binding, these assessments complemented the sequence identity analysis: substitutions predicted to destabilize binding were less likely to arise in susceptible species.  Overall, they report a "lack of major conformational changes between species," which they assert supports their choice of human ACE2 sequences as the "template" for performing their comparisons.  Generally, however, their comparisons showed that "[t]he majority of ACE2 codons are significantly conserved across vertebrates and across mammals," which they attribute to "its critical function in the renin-angiotensin system" in mammals.  Conversely, bats show a higher proportion of variants that result in less favorable binding, providing perhaps an explanation of why these species are (relatively) resistant to (or at least tolerant of) the consequences of viral infection.

Combining their survey information and molecular modeling based on crystallographic assessments, these authors report that several residues on the Spike protein were positively selected for binding, while ACE2 residues both within the binding sequence (set forth in italics in the sequence above) and outside it were likewise positively selected.

Figures 4A & 4B
As explained in the paper:  "In ACE2 (wheat-colored, with binding interface residues in yellow), selected residues occur both outside the binding interface (dark blue) and inside the binding interface (red, labeled with one asterisk)" (see Figures 4A and 4B from paper; shown above).

The authors caution that this evidence is "solely based on in silico analyses" and thus their conclusions need experimental validation.  Nevertheless, they note that "[f]ive out of six species with demonstrated susceptibility to SARS-CoV-2 infection score very high [rhesus macaque and cynomolgus macaque] or medium [domestic cat, tiger and golden hamster].  Both species susceptible to infection but asymptomatic scored low [dog and Egyptian rousette bat], and the three species resistant to infection scored either low [pig] or very low [mallard and red junglefowl]" (references omitted).  Ferrets, it seems, are an exception, scoring as low ACE2 binders but being susceptible to SARS-CoV-2 infection; while the authors speculate about why this is the case, there is no known basis for this discrepancy.  The high susceptibility of Old World monkeys suggests that monkeys indigenous to China might form a reservoir for the virus, but the same is true for cervids (deer) and some species of whale, and medium-scoring species (both domesticated animals and animals found in zoos) are also susceptible and hence may be animal reservoirs responsible for zoonotic infections in humans.  And while bats have been the source most commonly blamed for zoonotic transfer, bat species show the highest degree of resistance to infection with SARS-CoV-2.
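
The "five out of six" tally can be reproduced by tabulating the species named in this paragraph.  The short Python sketch below does only that; the table is a transcription of the scores and outcomes as reported above, while the choice to count "very high," "high," and "medium" together as predicting susceptibility (the name PERMISSIVE) is an assumption made for illustration.

```python
# Species -> (predicted ACE2 binding category, reported outcome), transcribed
# from the paragraph above.
OBSERVATIONS = {
    "rhesus macaque": ("very high", "susceptible"),
    "cynomolgus macaque": ("very high", "susceptible"),
    "domestic cat": ("medium", "susceptible"),
    "tiger": ("medium", "susceptible"),
    "golden hamster": ("medium", "susceptible"),
    "ferret": ("low", "susceptible"),
    "dog": ("low", "asymptomatic"),
    "Egyptian rousette bat": ("low", "asymptomatic"),
    "pig": ("low", "resistant"),
    "mallard": ("very low", "resistant"),
    "red junglefowl": ("very low", "resistant"),
}

# Assumption: these categories are read as predicting susceptibility.
PERMISSIVE = {"very high", "high", "medium"}


def concordance(observations, outcome):
    """Count species with the given outcome that scored in PERMISSIVE."""
    scores = [score for score, seen in observations.values() if seen == outcome]
    hits = sum(score in PERMISSIVE for score in scores)
    return hits, len(scores)


def exceptions(observations):
    """List susceptible species whose score falls outside PERMISSIVE."""
    return [
        species
        for species, (score, seen) in observations.items()
        if seen == "susceptible" and score not in PERMISSIVE
    ]


hits, total = concordance(OBSERVATIONS, "susceptible")
print(f"{hits} of {total} susceptible species scored medium or higher")
print("exceptions:", exceptions(OBSERVATIONS))
```

With the table as transcribed, the first line reports 5 of 6 and the only exception listed is the ferret, matching the discrepancy noted above.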

With regard to pangolins, these authors state:

Considerable controversy surrounds reports that pangolins can serve as an intermediate host for SARS-CoV-2, with some reports proposing that SARS-CoV-2 arose as a recombinant between bat and pangolin betacoronaviruses, while another study rejected that claim.  In our study, ACE2 of Chinese pangolin, Sunda pangolin, and white-bellied pangolin had low or very low binding score for SARS-CoV-2 S.  Binding of pangolin ACE2 to SARS-CoV-2 S was predicted using molecular binding simulations; however, neither experimental infection nor in vitro infection with SARS-CoV-2 has been reported for pangolins.  Further studies are necessary to resolve whether SARS-CoV2 S binds to pangolin ACE2 [citations omitted].

The paper concludes with a discussion of an unexplored aspect of the pandemic:  the effects of SARS-CoV-2 on threatened species, particularly those under human care to prevent their extinction (an irony consistent with the zeitgeist of 2020).  Forty percent of the species scoring in the researchers' very high, high, and medium categories are also classified as vulnerable, endangered, or critically endangered on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species; another five species are threatened and two are extinct in the wild.  These data suggest to these researchers that it would be prudent to enforce guidelines to minimize potential human-animal transmission for these species; fortunately, such guidelines have been established previously by such groups as the North American Association of Zoos and Aquariums, the American Association of Zoo Veterinarians, and the European Association of Zoo and Wildlife Veterinarians.

While the social, economic, and political consequences of the SARS-CoV-2 pandemic continue to evolve, some comfort (perhaps) can be had in realizing that one consequence is an intense focus on solving the biological questions the infection has raised, with the hope that the results obtained will be beneficial in countering the effects of the virus on humanity.

* The Genome Center, University of California, Davis; School of Biology and Environmental Science, University College Dublin; Graduate Program in Pharmaceutical Sciences and Pharmacogenomics, Quantitative Biosciences Consortium, University of California, San Francisco; Gladstone Institute of Data Science and Biotechnology, San Francisco; Cancer Program, Broad Institute of MIT and Harvard; Genetic Perturbation Platform, Broad Institute of MIT and Harvard; Max Planck Institute of Molecular Cell Biology and Genetics, Dresden; Max Planck Institute for the Physics of Complex Systems, Dresden; Center for Systems Biology Dresden; Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park; Department of Computational Biology, School of Computer Science, Carnegie Mellon University; Department of Ecology, Tibetan Centre for Ecology and Conservation, Hubei Key Laboratory of Cell Homeostasis, College of Life Sciences, Wuhan University; College of Science, Tibet University, Lhasa; Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California, San Francisco; Chan Zuckerberg Biohub, San Francisco; San Diego Zoo Institute for Conservation Research; Department of Evolution, Behavior, and Ecology, Division of Biology, University of California San Diego; Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine; School of Dental Medicine, Case Western Reserve University; Marine Mammal Program, Department of Vertebrate Zoology, Smithsonian Institution; Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University; Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Program in Molecular Medicine, University of Massachusetts Medical School; and John Muir Institute for the Environment, University of California, Davis.