
Wednesday, 10 November 2021

On the mutation possibilities of the SARS-2 virus S protein

 https://onlinelibrary.wiley.com/doi/10.1002/prot.26042

 

RESEARCH ARTICLE
Free Access

Human SARS CoV-2 spike protein mutations

First published: 09 January 2021
Citations: 27

Funding information: School of Chemistry, University of Hyderabad

 

This article presents analyses of S-protein changes in SARS-2 virus structures received from around the world, in which one can see a reflection of the typical sites where mutations occur. The article dates from early in the year; since then, new VOC/VOI/VUM variants with their new mutations have emerged.

Quotation from the text:


 

TABLE 1. Geographical distribution of human SARS-CoV-2 spike proteins and their associated number of mutations
Continent        Number of spike proteins   Number of mutations
Africa           103                        121
Asia             996                        1169
Europe           370                        360
North America    8268                       7453
South America    29                         26
Oceania          567                        525
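
As a quick reading aid for Table 1, the following minimal sketch sums the two columns and computes mutations per sequence for each continent. The ratio is an illustrative quantity and is not a figure reported in the article; the variable names are hypothetical.

```python
# Counts copied from Table 1: continent -> (spike proteins, mutations).
table1 = {
    "Africa": (103, 121),
    "Asia": (996, 1169),
    "Europe": (370, 360),
    "North America": (8268, 7453),
    "South America": (29, 26),
    "Oceania": (567, 525),
}

total_proteins = sum(proteins for proteins, _ in table1.values())
total_mutations = sum(mutations for _, mutations in table1.values())

# Mutations per sequence: an illustrative ratio, not taken from the article.
per_sequence = {c: m / p for c, (p, m) in table1.items()}

if __name__ == "__main__":
    for continent, ratio in sorted(per_sequence.items(), key=lambda kv: -kv[1]):
        print(f"{continent:14s} {ratio:.2f} mutations per sequence")
    print("totals:", total_proteins, "proteins,", total_mutations, "mutations")
```

The column totals agree with the 10333 sequences and 9654 mutations given in the article's conclusions.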

3.2 Distribution of the number of mutations at mutation sites in the spike protein sequence

Nearly one-third of the spike protein sequence is associated with mutations. The list of the mutation sites along with total number of mutations observed at individual mutation sites is shown in Table S1. The top 10 mutation sites according to the total number of occurrences were; D614(7859), L5(109), L54(105), P1263(61), P681(51), S477(47), T859(30), S221(28), V483(28), A845(24).
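
To make the dominance of position D614 concrete, the sketch below parses the top-10 list exactly as it is printed above and relates the counts to the 9654 total mutations stated in the conclusions. The regular expression and variable names are assumptions made for illustration only.

```python
import re

# The top-10 list as printed in section 3.2.
top10_text = ("D614(7859), L5(109), L54(105), P1263(61), P681(51), "
              "S477(47), T859(30), S221(28), V483(28), A845(24)")
TOTAL_MUTATIONS = 9654  # total stated in the article's conclusions

# Each entry has the form <residue letter><position>(<count>).
pattern = re.compile(r"([A-Z])(\d+)\((\d+)\)")
sites = [(res, int(pos), int(n)) for res, pos, n in pattern.findall(top10_text)]

top10_count = sum(n for _, _, n in sites)
d614_share = sites[0][2] / TOTAL_MUTATIONS   # share of all mutations at D614
top10_share = top10_count / TOTAL_MUTATIONS  # share covered by the ten sites

if __name__ == "__main__":
    print(f"D614 share of all mutations: {d614_share:.1%}")
    print(f"Top-10 sites share:          {top10_share:.1%}")
```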

3.3 Mutation density in different regions of the spike protein sequence

The distribution of the mutations within different regions of the spike protein is shown in Table 2. Mutations are distributed in almost all regions of the protein. The S1D domain (594-674) that comprises the D614G mutation is the most predominant and is observed in 7859 of the 7915 mutations in this region of the spike protein. Our analysis is in agreement with the high frequency of D614G mutation in the spike proteins or the more common occurrence as reported previously.11-14 The mutation density evaluated as a function of the number of mutations observed over the sequence length corresponding to different regions in the spike protein is shown in Figure 1. The protease cleavage site (between residues 675 and 692) in the spike protein is associated with the maximum mutation density. The mutations at this site in the spike protein may be of advantage for the virus to undergo proteolytic cleavage by a large number of host enzymes in its evolution. Further, the NTD (S1A domain) is another region where mutations have accumulated relatively more in number compared with rest of the spike protein.

TABLE 2. Distribution of mutations in the different regions of human SARS-CoV-2 spike proteins
Region (residue range)                  Total number of mutations   Number of distinct mutation types
S1A domain (1-302)                      759                         196
S1A-S1B linker (303-332)                31                          11
S1B domain (333-527)                    204                         52
S1B-S1C linker (528-533)                1                           1
S1C domain (534-589)                    75                          15
S1C-S1D linker (590-593)                0                           0
S1D domain (594-674)                    7915                        34
Protease cleavage site (675-692)        126                         35
S1-S2 subunits linker (693-710)         13                          7
Central β-strand (711-737)              13                          7
Downward helix (738-782)                24                          18
S2′ cleavage site (783-815)             27                          13
Fusion peptide (816-828)                6                           3
Connecting region (829-911)             111                         26
Heptad repeat region (912-983)          73                          18
Central helix (984-1034)                8                           6
β-hairpin (1035-1068)                   7                           4
β-sheet domain (1069-1133)              79                          27
Heptad repeat region (1134-1213)        65                          22
Transmembrane region (1214-1236)        28                          7
Cytoplasmic region (1237-1273)          89                          15
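
The mutation density described in section 3.3 (mutations divided by the length of the region) can be recomputed from Table 2. The exact definition behind Figure 1 is not reproduced in this excerpt, so the sketch below is only an approximation under two assumed readings: distinct mutation types per residue, and total mutations per residue with the 7859 D614 occurrences set aside. Using raw totals, the S1D domain would dominate simply because of D614G. The last table row is read as 89 and 15, an assumption that makes the column total equal the 9654 mutations given in the conclusions.

```python
# (region, first residue, last residue, total mutations, distinct mutation types)
# Rows follow Table 2. The final row is read as 89 / 15 (assumption, see lead-in).
REGIONS = [
    ("S1A domain", 1, 302, 759, 196),
    ("S1A-S1B linker", 303, 332, 31, 11),
    ("S1B domain", 333, 527, 204, 52),
    ("S1B-S1C linker", 528, 533, 1, 1),
    ("S1C domain", 534, 589, 75, 15),
    ("S1C-S1D linker", 590, 593, 0, 0),
    ("S1D domain", 594, 674, 7915, 34),
    ("Protease cleavage site", 675, 692, 126, 35),
    ("S1-S2 subunits linker", 693, 710, 13, 7),
    ("Central beta-strand", 711, 737, 13, 7),
    ("Downward helix", 738, 782, 24, 18),
    ("S2' cleavage site", 783, 815, 27, 13),
    ("Fusion peptide", 816, 828, 6, 3),
    ("Connecting region", 829, 911, 111, 26),
    ("Heptad repeat region (912-983)", 912, 983, 73, 18),
    ("Central helix", 984, 1034, 8, 6),
    ("beta-hairpin", 1035, 1068, 7, 4),
    ("beta-sheet domain", 1069, 1133, 79, 27),
    ("Heptad repeat region (1134-1213)", 1134, 1213, 65, 22),
    ("Transmembrane region", 1214, 1236, 28, 7),
    ("Cytoplasmic region", 1237, 1273, 89, 15),
]
D614_COUNT = 7859  # occurrences at D614, from section 3.2


def density(count, start, end):
    """Mutations per residue for an inclusive residue range."""
    return count / (end - start + 1)


# Reading 1: distinct mutation types per residue.
by_types = {name: density(types, s, e) for name, s, e, _, types in REGIONS}

# Reading 2: total mutations per residue, with the D614 occurrences set aside.
by_total_without_d614 = {
    name: density(total - (D614_COUNT if name == "S1D domain" else 0), s, e)
    for name, s, e, total, _ in REGIONS
}

densest_by_types = max(by_types, key=by_types.get)
densest_without_d614 = max(by_total_without_d614, key=by_total_without_d614.get)

if __name__ == "__main__":
    for name in sorted(by_types, key=by_types.get, reverse=True)[:3]:
        print(f"{name:32s} {by_types[name]:.2f} distinct types per residue")
```

Under both readings the protease cleavage site comes out on top, followed by the S1A domain (NTD), which is consistent with the statement in section 3.3.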

 

The spike protein comprises an N-terminal S1 subunit and a C-terminal membrane proximal S2 subunit. The S1 subunit consists of the S1A, S1B, S1C and S1D domains. The S1A domain, referred to as the N-terminal domain (NTD), recognizes carbohydrates, such as sialic acid, required for attachment of the virus to the host cell surface.

The S1B domain, referred to as the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein, interacts with the human ACE-2 receptor.2, 7

The structural elements within the S2 subunit comprise three long α-helices, multiple α-helical segments, extended twisted β-sheets, a membrane spanning α-helix, and an intracellular cysteine rich segment.

 The PRRA sequence motif located between the S1 and S2 subunits in SARS-CoV-2 presents a furin-cleavage site.8 

 In the S2 subunit, a second proteolytic cleavage site S2′, upstream of the fusion peptide is present. Both these cleavage sites participate in the viral entry into host cells.

 

In a study on the infectivity and reactivity to a panel of neutralizing antibodies and sera from convalescent patients,9 mutations and glycosylation site modifications have been reported in human SARS-CoV-2 spike proteins. Few mutations have been reported in the spike glycoprotein.10 The D614G mutation is reported to be relatively more common11-14 and is known to increase the efficiency of causing infection.2 Mutation sites for spike proteins from some of the SARS-CoV-2 Indian isolates have been mapped on to protein three-dimensional structure.15-17

In light of the large number of SARS-CoV-2 spike protein sequences currently available in the NCBI virus database, I intended to carry out an exhaustive analysis, in order to understand the current scenario of mutations in the spike proteins. This study informs us of all the mutations present in the human SARS-CoV-2 spike proteins relative to Wuhan-Hu-1 reference sequence from China, according to their geographical locations, positions of the mutation sites, distribution of the number of mutations at the mutation sites, the different mutation types observed so far, mutations at glycosylation sites, occurrence of multiple mutations in a single spike protein and mutations within the RBD close to the host-cell ACE-2 receptor interactions. This study has implications from the perspective of vaccine, antibody, and drug design.
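
The excerpt does not describe how the mutations were extracted, so the following is only a minimal sketch of one way to list differences relative to a reference sequence. It assumes the query has already been aligned to the reference (equal length, "-" for gaps); the function name, the "del" suffix and the toy sequences are hypothetical and are not real spike fragments.

```python
def call_mutations(reference: str, query: str) -> list[str]:
    """List differences between two pre-aligned, equal-length sequences."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    position = 0  # 1-based residue number in the reference
    for ref_aa, query_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion relative to the reference; ignored in this sketch
        position += 1
        if query_aa == ref_aa or query_aa == "X":
            continue  # identical residue, or ambiguous residue call
        if query_aa == "-":
            mutations.append(f"{ref_aa}{position}del")
        else:
            mutations.append(f"{ref_aa}{position}{query_aa}")
    return mutations


# Made-up toy sequences for illustration only.
toy_reference = "MKTAYDQLGS"
toy_query = "MKTAYGQL-S"

if __name__ == "__main__":
    print(call_mutations(toy_reference, toy_query))
```

A real analysis would first need a proper alignment step and a policy for insertions and ambiguous residues, both of which are left out here.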

3.4 Human SARS-CoV-2 spike protein mutation sites and mutation types

More than one mutation type can be found at the same position in the spike protein sequence. For instance, at position 88 the amino acid residue D is observed to be mutated either to N, E, Y, or A. At position 675, the amino acid residue Q is mutated either to R, H, K or is deleted among the spike proteins. The geographical location-wise distribution of the mutation sites and mutation types is shown in Table 3. Accordingly, the total number of mutation sites observed were; North America (300), South America (4), Europe (42), Africa (16), Asia (166), and Oceania (51) and the total number of mutation types observed were; North America (350), South America (4), Europe (43), Africa (16), Asia (181), and Oceania (51). It is clear from our study that the human SARS-CoV-2 spike protein undergoes mutations at multiple sites and there can be more than one mutation type associated with a mutation site. The D614G is the only mutation, that has so far been commonly observed among the spike proteins from all the continents. Table 3 serves as a reference to consult the presence or absence of a particular mutation within and between the various continents.
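
The distinction between mutation sites and mutation types can be illustrated with a small grouping sketch. It uses the two examples from the paragraph above (positions 88 and 675) plus D614G; the label format with a "del" suffix for deletions and all names in the code are assumptions for illustration.

```python
import re
from collections import defaultdict

# <reference residue><position><new residue or "del">, e.g. D88N or Q675del.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")


def group_by_site(mutations):
    """Map (position, reference residue) to the set of observed mutation types."""
    sites = defaultdict(set)
    for label in mutations:
        match = MUTATION.match(label)
        if match is None:
            raise ValueError(f"unrecognised mutation label: {label!r}")
        ref, pos, alt = match.groups()
        sites[(int(pos), ref)].add(alt)
    return dict(sites)


observed = ["D88N", "D88E", "D88Y", "D88A",
            "Q675R", "Q675H", "Q675K", "Q675del",
            "D614G"]

sites = group_by_site(observed)
n_sites = len(sites)                                 # distinct mutation sites
n_types = sum(len(alts) for alts in sites.values())  # distinct mutation types

if __name__ == "__main__":
    for (pos, ref), alts in sorted(sites.items()):
        print(f"{ref}{pos}: {', '.join(sorted(alts))}")
    print(n_sites, "sites,", n_types, "types")
```

Because one site can carry several types, the number of types is never smaller than the number of sites, which matches the per-continent counts quoted above (for example 300 sites and 350 types for North America).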

TABLE 3. Mutation sites and mutation types observed in human SARS-CoV-2 spike proteins according to geographical locations
(See the link!)

3.5 Mapping mutations in receptor-binding domain of SARS-CoV-2 spike protein

The spike protein plays a vital role for the attachment to host cell-surface specific receptors and subsequently catalyzes the virus - host cell membrane fusion required for causing infection. The RBD in spike protein interacts with host ACE-2 receptor to cause the novel coronavirus infection leading to COVID-19 disease. The three-dimensional crystal structure of human SARS-CoV-2 RBD (between residues 333 and 527) complexed with the ACE-2 receptor (PDB code: 6LZG) was used to map the mutation sites as shown in Figure 2. The RBD of human SARS COV-2 spike proteins from different continents is associated with 44 distinct mutation sites. The mutations are located at positions; 337, 344, 345, 348, 354, 357, 367, 368, 379, 382, 384, 393, 395, 403, 407, 408, 411, 413, 441, 453, 457, 458, 468, 471, 476, 477, 479, 483, 484, 485, 486, 491, 493, 494, 498, 500, 501, 506, 507, 508, 518, 519, 520, 522. The numbers marked in bold indicate instances of more than 10 occurrences of the mutations observed at the particular position. The numbers underlined indicate instances of more than five occurrences of the mutations observed at the particular position. The distribution of the number of mutations in RBD observed at the 44 different positions is shown in Figure 3. The mutation sites and mutation types of the human SARS-CoV-2 spike protein in the RBD according to the geographical location-wise distribution is included in Table 3. Accordingly, the total number of distinct mutation sites in RBD observed were; North America (27), South America (0), Europe (7), Africa (1), Asia (15), and Oceania (9) and the total number of distinct mutation types observed were; North America (28), South America (0), Europe (7), Africa (1), Asia (16), and Oceania (9). The mutations occurring in relatively large numbers in RBD were examined. A maximum of 47 mutations were observed at position 477 in the RBD of the spike protein. 
The S477N mutation was present in 45 spike proteins from Oceania, one from Asia (NCBI ID: QLR12405.1) and one from North America (NCBI ID: QMU91291.1). The V483A mutation was observed in 27 spike proteins representative of North America and one spike protein from Oceania (QLG76529.1) contained a mutation at the same position, but with mutation type V483F. The A344S mutation was observed 18 times and all the associated spike proteins were representative of North America.

 ...

Some of the residues in the spike protein RBD that are involved in the interactions with the ACE-2 receptor are mutated as shown in Figure 4. The residues in spike protein RBD that are ≤3.2 Å inter-atomic distance from the ACE-2 receptor as observed in the crystal structure of the human SARS-CoV-2 spike protein RBD complexed with ACE-2 receptor (PDB code: 6LZG) are; K417, Y449, Y453, A475, N487, T500, N501, and G502. The Y453 residue in spike protein RBD is close to His34 in ACE-2 receptor and T500 and N501 residues in spike protein RBD are close to Tyr41 in ACE-2.22 Few residues that are close to residues ≤3.2 Å from ACE-2 receptor are also associated with the mutations. These residues are G476 (next to A475), F486 (next to N487) and G502 (next to N501).
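
The overlap between the ACE-2 contact residues and the mutated RBD positions can be checked with simple set operations. The sketch below uses the 44 positions listed in section 3.5 and the eight contact residues named above; it is an illustration with hypothetical variable names, not the article's procedure. Position 502 does not appear in the list of 44 positions quoted in this excerpt, so the sketch does not flag it.

```python
# The 44 mutated RBD positions listed in section 3.5.
RBD_MUTATION_SITES = {
    337, 344, 345, 348, 354, 357, 367, 368, 379, 382, 384, 393, 395, 403,
    407, 408, 411, 413, 441, 453, 457, 458, 468, 471, 476, 477, 479, 483,
    484, 485, 486, 491, 493, 494, 498, 500, 501, 506, 507, 508, 518, 519,
    520, 522,
}

# RBD residues within 3.2 angstroms of ACE-2 in PDB 6LZG, as named in the text.
INTERFACE = {417, 449, 453, 475, 487, 500, 501, 502}

# Contact residues that are themselves mutation sites.
mutated_interface = sorted(INTERFACE & RBD_MUTATION_SITES)

# Mutation sites directly adjacent in sequence to a contact residue.
neighbours = {pos + step for pos in INTERFACE for step in (-1, 1)}
mutated_neighbours = sorted((neighbours - INTERFACE) & RBD_MUTATION_SITES)

if __name__ == "__main__":
    print("mutated contact residues:", mutated_interface)
    print("mutated sequence neighbours:", mutated_neighbours)
```

Together the two lists give positions 453, 476, 486, 500 and 501, the same five positions that the article's conclusions name as close to the ACE-2 receptor.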

 ...

 

4 CONCLUSIONS (January 2021)

The human SARS-CoV-2 spike proteins comprised 400 distinct mutation sites with reference to the first human SARS-CoV-2 sequence from Wuhan-Hu-1, China. The mutations are present in 8155 proteins among 10333 human SARS-CoV-2 spike protein sequences analyzed in the present work. The total number of mutations observed were 9654 for all the spike proteins. The mutation sites are distributed over whole length of the protein sequence with the maximum mutation density being near the protease cleavage site between residues 675 and 692. The RBD is associated with 44 mutation sites and within the RBD, the mutations; S477N, V483A, A344S, N501Y were more frequent. The D614G mutation is predominant and is the only common mutation in the spike protein observed so far among all the continents. Some of the SARS-CoV-2 spike proteins are associated with the mutations at glycosylation sites. Nearly, 21% SARS-CoV-2 spike proteins have not undergone any mutations yet with respect to the first human SARS-CoV-2 Wuhan-Hu-1 reference sequence that became available during December 2019. Within the RBD, mutations were observed for Y453, G476, F486, T500, N501 that are close to the ACE-2 receptor. The mutations present at the interface between the spike protein and ACE-2 receptor could potentially affect vaccine performance and drugs designed at the interface of protein-protein interactions. Therefore, the mutations identified in the present work would be important considerations for antibody, vaccine, and drug development. ..."
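
The headline figures in the conclusions can be cross-checked with a few lines of arithmetic. This is a minimal sketch with hypothetical names; the spike length of 1273 residues is taken from the last residue number in Table 2.

```python
TOTAL_SEQUENCES = 10333    # spike protein sequences analysed
MUTATED_SEQUENCES = 8155   # sequences carrying at least one mutation
TOTAL_MUTATIONS = 9654     # mutations summed over all sequences
MUTATION_SITES = 400       # distinct mutation sites
SPIKE_LENGTH = 1273        # last residue number in Table 2

unmutated_fraction = (TOTAL_SEQUENCES - MUTATED_SEQUENCES) / TOTAL_SEQUENCES
site_fraction = MUTATION_SITES / SPIKE_LENGTH
mutations_per_mutated_sequence = TOTAL_MUTATIONS / MUTATED_SEQUENCES

if __name__ == "__main__":
    print(f"unmutated sequences:       {unmutated_fraction:.1%}")
    print(f"positions with a mutation: {site_fraction:.1%}")
    print(f"mutations per mutated seq: {mutations_per_mutated_sequence:.2f}")
```

The results correspond to the article's "nearly 21%" of unmutated sequences and to "nearly one-third" of the sequence being associated with mutations.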
