Site-specific glycan analysis of the SARS-CoV-2 spike
- Yasunori Watanabe1,2,3,*,
- Joel D. Allen1,*,
- Daniel Wrapp4,
- Jason S. McLellan4,
- Max Crispin1,†
Abstract
The emergence of the betacoronavirus, SARS-CoV-2, the causative agent of
COVID-19, represents a significant threat to global human health.
Vaccine development is focused on the principal target of the humoral
immune response, the spike (S) glycoprotein, which mediates cell entry
and membrane fusion. The SARS-CoV-2 S gene encodes 22 N-linked glycan
sequons per protomer, which likely play a role in protein folding and
immune evasion. Here, using a site-specific mass spectrometric approach,
we reveal the glycan structures on a recombinant SARS-CoV-2 S
immunogen. This analysis enables mapping of the glycan-processing states
across the trimeric viral spike. We show how SARS-CoV-2 S glycans
differ from typical host glycan processing, which may have implications
for viral pathobiology and vaccine design.
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), the causative pathogen of COVID-19 (1, 2),
induces fever, severe respiratory illness and pneumonia. SARS-CoV-2
utilizes an extensively glycosylated spike (S) protein that protrudes
from the viral surface to bind to angiotensin-converting enzyme 2 (ACE2)
to mediate host-cell entry (3).
The S protein is a trimeric class I fusion protein, composed of two
functional subunits, responsible for receptor binding (S1 subunit) and
membrane fusion (S2 subunit) (4, 5).
Remarkably, the surface of the envelope spike is dominated by
host-derived glycans with each trimer displaying 66 N-linked
glycosylation sites. The S protein is a key target in vaccine design
efforts (6),
and understanding the glycosylation of recombinant viral spikes can
reveal fundamental features of viral biology and guide vaccine design
strategies (7, 8).
Viral glycosylation has wide-ranging roles in viral pathobiology, including
mediating protein folding and stability, and shaping viral tropism (9).
Glycosylation sites are under selective pressure as they facilitate
immune evasion by shielding specific epitopes from antibody
neutralization. However, we note the low mutation rate of SARS-CoV-2
and that, as yet, no mutations to N-linked glycosylation sites have
been observed (10).
Surfaces with an unusually high density of glycans can also enable immune recognition (9, 11, 12). The role of glycosylation in camouflaging immunogenic protein epitopes has been studied for other coronaviruses (10, 13, 14).
Coronaviruses form virions by budding into the lumen of endoplasmic reticulum-Golgi intermediate compartments (ERGIC) (15, 16).
However, observations of complex-type glycans on virally derived
material suggest that the viral glycoproteins are subjected to
Golgi-resident processing enzymes (13, 17).
High viral glycan density and local protein architecture can sterically
impair the glycan maturation pathway. Impaired glycan maturation
resulting in the presence of oligomannose-type glycans can be a
sensitive reporter of native-like protein architecture (8), and site-specific glycan analysis can be used to compare different immunogens and monitor manufacturing processes (18).
Additionally, glycosylation can influence the trafficking of recombinant immunogen to germinal centers (19).
To resolve the site-specific glycosylation of SARS-CoV-2 S protein and
visualize the distribution of glycoforms across the protein surface, we
expressed and purified three biological replicates of recombinant
soluble material in an identical manner to that which was used to obtain
the high-resolution cryo-electron microscopy (cryo-EM) structure,
albeit without glycan processing blockade using kifunensine (4).
This variant retains all 22 N-linked glycan sites of the SARS-CoV-2 S protein (Fig. 1A). Stabilization of the trimeric prefusion structure was achieved using the “2P” stabilizing mutations (20)
at residues 986 and 987, a “GSAS” substitution at the furin cleavage
site (residues 682–685), and a C-terminal trimerization motif. This
helps to maintain quaternary architecture during glycan processing.
Prior to analysis, supernatant containing the recombinant SARS-CoV-2 S
was purified by size-exclusion chromatography to ensure only native-like
trimeric protein was analyzed (Fig. 1B and fig. S1). The trimeric conformation of the purified material was validated using negative-stain electron microscopy (Fig. 1C).
Fig. 1
(A)
Schematic representation of SARS-CoV-2 S glycoprotein. The positions of
N-linked glycosylation sequons (N-X-S/T, where X≠P) are shown as
branches. Protein domains are illustrated: N-terminal domain (NTD),
receptor-binding domain (RBD), fusion peptide (FP), heptad repeat 1
(HR1), central helix (CH), connector domain (CD), and transmembrane
domain (TM). (B) SDS-PAGE analysis of SARS-CoV-2 S
protein expressed in human embryonic kidney 293F cells. Lane 1: filtered
supernatant from transfected cells; lane 2: flow-through from
StrepTactin resin; lane 3: wash from StrepTactin resin; lane 4: elution
from StrepTactin resin. (C) Negative-stain EM 2D class
averages of the SARS-CoV-2 S protein. 2D class averages of the
SARS-CoV-2 S protein are shown, confirming that the protein adopts the
trimeric prefusion conformation matching the material used to determine
the structure (4).
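The sequon rule given in the caption (N-X-S/T, where X≠P) can be written as a short pattern search. The sketch below is illustrative only: the function name `find_sequons` and the toy peptide are hypothetical, and the sequence is not the real spike sequence. Applied to a full protomer sequence, the same scan would be expected to report the 22 sequons described here, giving 3 × 22 = 66 per trimer.

```python
import re
from typing import List

# N-X-S/T sequon with X != P. A zero-width lookahead is used so that
# overlapping sequons (e.g. "NNTT") are each reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str, offset: int = 1) -> List[int]:
    """Return the position of the Asn residue of every N-X-S/T sequon.

    Positions are 1-based by default (offset=1), matching residue numbering.
    """
    return [m.start() + offset for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide, NOT the spike sequence.
toy = "MKNGTAANPSLLNVSQNNTT"
positions = find_sequons(toy)
print(positions)  # [3, 13, 17, 18]; the "NPS" motif at residue 8 is skipped

# Sites per trimer for a protomer carrying 22 sequons, as stated in the text.
print(3 * 22)  # 66
```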
To determine the site-specific glycosylation of SARS-CoV-2 S, we
employed trypsin, chymotrypsin, and alpha-lytic protease to generate
three glycopeptide samples. These proteases were selected to generate
glycopeptides that contain a single N-linked glycan sequon. The
glycopeptides were analyzed by liquid-chromatography-mass spectrometry
(LC-MS), and the glycan compositions were determined for all 22 N-linked
glycan sites (Fig. 2).
To convey the main processing features at each site, the abundances of
each glycan are summed into oligomannose-, hybrid- and categories of
complex-type glycosylation based on branching and fucosylation. The
detailed, expanded graphs showing the diverse range of glycan
compositions are presented in table S1 and fig. S2.
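The requirement that each glycopeptide contain a single sequon can be checked in silico. The following is a minimal sketch, assuming the simplified textbook trypsin rule (cleave after K or R unless the next residue is P) with no missed cleavages; chymotrypsin and alpha-lytic protease are not modelled. The helper names and the toy sequence are hypothetical and are not taken from the study.

```python
import re
from typing import List

# N-X-S/T sequon with X != P (lookahead so overlapping sequons are counted).
SEQUON = re.compile(r"(?=N[^P][ST])")


def tryptic_digest(sequence: str) -> List[str]:
    """Split after K or R unless followed by P (simplified rule, no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def count_sequons(peptide: str) -> int:
    """Number of N-X-S/T sequons contained in a peptide."""
    return len(SEQUON.findall(peptide))


# Hypothetical toy sequence, NOT the spike sequence.
toy = "MNGTKAANVSRNPTKLLNITQNGSK"
peptides = tryptic_digest(toy)

for pep in peptides:
    print(pep, count_sequons(pep))

# Peptides that would be suitable for unambiguous site assignment.
single_sequon = [p for p in peptides if count_sequons(p) == 1]
print(single_sequon)  # ['MNGTK', 'AANVSR']
```

In this toy case the last peptide carries two sequons, which is the kind of situation where a protease with a different specificity would be needed to separate the sites.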
Fig. 2
The schematic illustrates the color code for the principal glycan types
that can arise along the maturation pathway from oligomannose-, hybrid-
to complex-type glycans. The graphs summarize quantitative mass
spectrometric analysis of the glycan population present at individual
N-linked glycosylation sites simplified into categories of glycans. The
oligomannose-type glycan series (M9 to M5; Man9GlcNAc2 to Man5GlcNAc2)
is colored green, afucosylated and fucosylated hybrid-type glycans
(Hybrid & F Hybrid) are dashed pink, and complex glycans, grouped
according to the number of antennae and presence of core fucosylation
(A1 to FA4), are colored pink. Unoccupancy of an N-linked glycan site
is represented in grey. The pie charts summarize the quantification of
these glycans. Glycan sites are colored according to oligomannose-type
glycan content with the glycan sites labeled in green (80−100%), orange
(30−79%) and pink (0−29%). An extended version of the site-specific
analysis showing the heterogeneity within each category can be found in
table S1 and fig. S2. The bar graphs represent the mean quantities of
three biological replicates with error bars representing the standard
error of the mean.
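The summary statistics and color bins described in this caption can be sketched as follows. The replicate values are hypothetical placeholders, not data from the study, and the function names are illustrative. The caption gives integer bins (80−100%, 30−79%, 0−29%); treating 80 and 30 as inclusive lower bounds for non-integer means is an assumption made here.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def colour_for_site(oligomannose_percent: float) -> str:
    """Map oligomannose-type content to the label color used in Fig. 2.

    Assumption: thresholds at 80 and 30 are inclusive lower bounds.
    """
    if oligomannose_percent >= 80:
        return "green"
    if oligomannose_percent >= 30:
        return "orange"
    return "pink"


# Hypothetical oligomannose percentages from three biological replicates.
replicates = [85.0, 88.0, 91.0]
avg, sem = mean_and_sem(replicates)
print(f"mean = {avg:.1f}%, SEM = {sem:.2f}%")  # mean = 88.0%, SEM = 1.73%
print(colour_for_site(avg))  # green
```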
There are two sites on SARS-CoV-2 S that are principally
oligomannose-type: N234 and N709. The predominant oligomannose-type
glycan structure observed across the protein, with the exception of
N234, is Man5GlcNAc2, which demonstrates that
these sites are largely accessible to α1,2-mannosidases but are poor
substrates for GlcNAcT-I, which is the gateway enzyme in the formation
of hybrid- and complex-type glycans in the Golgi apparatus. The stage at
which processing is impeded is a signature related to the density and
presentation of glycans on the viral spike. For example, the more
densely glycosylated spikes of HIV-1 Env and Lassa virus GPC exhibit
numerous sites dominated by Man9GlcNAc2 (21–24).
A mixture of oligomannose- and complex-type glycans can be found at sites N61, N122, N603, N717, N801 and N1074 (Fig. 2).
Of the 22 sites on the S protein, 8 contain significant populations of
oligomannose-type glycans, highlighting how the processing of the
SARS-CoV-2 S glycans is divergent from host glycoproteins (25). The remaining 14 sites are dominated by processed, complex-type glycans.
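The site counts in this paragraph can be tallied directly from the sites named in the text. This is a bookkeeping sketch only, assuming that the eight oligomannose-containing sites are the two principally oligomannose-type sites plus the six mixed sites listed above; the variable names are illustrative.

```python
# Sites named in the text.
principally_oligomannose = ["N234", "N709"]
mixed = ["N61", "N122", "N603", "N717", "N801", "N1074"]

total_sites = 22  # N-linked sequons per protomer, as stated in the text

# Assumption: "8 sites with significant oligomannose populations" = 2 + 6.
oligomannose_containing = len(principally_oligomannose) + len(mixed)
complex_dominated = total_sites - oligomannose_containing

print(oligomannose_containing)  # 8
print(complex_dominated)        # 14
```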
Although unoccupied glycosylation sites were detected on SARS-CoV-2 S, when
quantified they were revealed to form a very minor component of the
total peptide pool (table S2). In HIV-1 immunogen research, the holes
generated by unoccupied glycan sites have been shown to be immunogenic
and potentially give rise to distracting epitopes (26).
The high occupancy of N-linked glycan sequons of SARS-CoV-2 S indicates
that recombinant immunogens will not require further optimization to
enhance site occupancy.
Using the cryo-EM structure of the trimeric SARS-CoV-2 S protein (PDB ID 6VSB) (4), we mapped the glycosylation status of the coronavirus spike mimetic onto the experimentally determined 3D structure (Fig. 3).
This combined mass spectrometric and cryo-EM analysis reveals how the
N-linked glycans occlude distinct regions across the surface of the
SARS-CoV-2 spike.