
Sunday 9 February 2020

Latest information on the novel coronavirus (1): the coronaviruses included in the comparison, tabulated

Articles

Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan


Pages 221-236 | Received 16 Jan 2020, Accepted 17 Jan 2020, Published online: 28 Jan 2020

Abstract
 A mysterious outbreak of atypical pneumonia in late 2019 was traced to a seafood wholesale market in Wuhan of China. Within a few weeks, a novel coronavirus tentatively named as 2019 novel coronavirus (2019-nCoV) was announced by the World Health Organization. We performed bioinformatics analysis on a virus genome from a patient with 2019-nCoV infection and compared it with other related coronavirus genomes. Overall, the genome of 2019-nCoV has 89% nucleotide identity with bat SARS-like-CoVZXC21 and 82% with that of human SARS-CoV. The phylogenetic trees of their orf1a/b, Spike, Envelope, Membrane and Nucleoprotein also clustered closely with those of the bat, civet and human SARS coronaviruses. However, the external subdomain of Spike’s receptor binding domain of 2019-nCoV shares only 40% amino acid identity with other SARS-related coronaviruses. Remarkably, its orf3b encodes a completely novel short protein. Furthermore, its new orf8 likely encodes a secreted protein with an alpha-helix, following with a beta-sheet(s) containing six strands. Learning from the roles of civet in SARS and camel in MERS, hunting for the animal source of 2019-nCoV and its more ancestral virus would be important for understanding the origin and evolution of this novel lineage B betacoronavirus. These findings provide the basis for starting further studies on the pathogenesis, and optimizing the design of diagnostic, antiviral and vaccination strategies for this emerging infection.
KEYWORDS: Coronavirus, Wuhan, SARS, emerging, genome, respiratory, virus, bioinformatics

Introduction

Coronaviruses (CoVs) are enveloped, positive-sense, single-stranded RNA viruses that belong to the subfamily Coronavirinae, family Coronaviridae, order Nidovirales.
There are four genera of CoVs, namely, Alphacoronavirus (αCoV), Betacoronavirus (βCoV), Deltacoronavirus (δCoV), and Gammacoronavirus (γCoV) [1].
 Evolutionary analyses have shown that bats and rodents are the gene sources of most αCoVs and βCoVs,
while avian species are the gene sources of most δCoVs and γCoVs.
CoVs have repeatedly crossed species barriers and some have emerged as important human pathogens. The best-known examples include severe acute respiratory syndrome CoV (SARS-CoV) which emerged in China in 2002–2003 to cause a large-scale epidemic with about 8000 infections and 800 deaths, and Middle East respiratory syndrome CoV (MERS-CoV) which has caused a persistent epidemic in the Arabian Peninsula since 2012 [2,3]. In both of these epidemics, these viruses have likely originated from bats and then jumped into another amplification mammalian host [the Himalayan palm civet (Paguma larvata) for SARS-CoV and the dromedary camel (Camelus dromedarius) for MERS-CoV] before crossing species barriers to infect humans.

Prior to December 2019, 6 CoVs were known to infect humans, including 2 αCoVs (HCoV-229E and HKU-NL63) and 4 βCoVs (HCoV-OC43 [lineage A], HCoV-HKU1 [lineage A], SARS-CoV [lineage B] and MERS-CoV [lineage C]).

 The βCoV lineage A HCoV-OC43 and HCoV-HKU1 usually cause self-limiting upper respiratory infections in immunocompetent hosts and occasionally lower respiratory tract infections in immunocompromised hosts and elderly [4].
 In contrast, SARS-CoV (lineage B βCoV) and MERS-CoV (lineage C βCoV) may cause severe lower respiratory tract infection with acute respiratory distress syndrome and extrapulmonary manifestations, such as diarrhea, lymphopenia, deranged liver and renal function tests, and multiorgan dysfunction syndrome, among both immunocompetent and immunocompromised hosts with mortality rates of ∼10% and ∼35%, respectively [5,6].

 On 31 December 2019, the World Health Organization (WHO) was informed of cases of pneumonia of unknown cause in Wuhan City, Hubei Province, China [7]. Subsequent virological testing showed that a novel CoV was detected in these patients.
As of 16 January 2020, 43 patients have been diagnosed to have infection with this novel CoV, including two exported cases of mild pneumonia in Thailand and Japan [8,9].
The earliest date of symptom onset was 1 December 2019 [10].
The symptomatology of these patients included fever, malaise, dry cough, and dyspnea. Among 41 patients admitted to a designated hospital in Wuhan, 13 (32%) required intensive care and 6 (15%) died. All 41 patients had pneumonia with abnormal findings on chest computerized tomography scans [10].
We recently reported a familial cluster of 2019-nCoV infection in a Shenzhen family with travel history to Wuhan [11]. In the present study, we analyzed a 2019-nCoV complete genome from a patient in this familial cluster and compared it with the genomes of related β CoVs to provide insights into the potential source and control strategies.

Materials and methods

Viral sequences

The complete genome sequence of 2019-nCoV HKU-SZ-005b was available at GenBank (accession no. MN975262) (Table 1). The representative complete genomes of other related βCoVs strains collected from human or mammals were included for comparative analysis. These included strains collected from human, bats, and Himalayan palm civet between 2003 and 2018, with one 229E coronavirus strain as the outgroup.


Table 1. List of coronaviruses used in this study.
Accession number | Name displayed on the tree | Name of full-length genome | Year
AY274119 | Human SARS-CoV Tor2 2003 | SARS-related coronavirus isolate Tor2 | 2003
AY278488 | Human SARS-CoV BJ01 2003 | SARS coronavirus BJ01 | 2003
AY278491 | SARS coronavirus HKU-39849 2003 | SARS coronavirus HKU-39849 2003 | 2003
AY390556 | Human SARS-CoV GZ02 2003 | SARS coronavirus GZ02 | 2003
AY391777 | Human CoV OC43 2003 | Human coronavirus OC43 | 2003
AY515512 | Paguma SARS CoV HC/SZ/61/03 2003 | SARS coronavirus HC/SZ/61/03 (paguma SARS) | 2018
EF065513 | Bat CoV HKU9-1 2006 | Bat coronavirus HKU9-1 | 2006
FJ588686 | Bat SL-CoV Rs672 2006 | Bat SARS CoV Rs672/2006 | 2006
KC881005 | Bat SL-CoV RsSHC014 2013 | Bat SARS-like coronavirus RsSHC014 | 2013
KC881006 | Bat SL-CoV Rs3367 2013 | Bat SARS-like coronavirus Rs3367 | 2013
KY417146 | Bat SL-CoV Rs4231 2016 | Bat SARS-like coronavirus isolate Rs4231 | 2016
KY417149 | Bat SL-CoV Rs4255 2016 | Bat SARS-like coronavirus isolate Rs4255 | 2016
MG772933 | Bat SL-CoV ZC45 2018 | Bat SARS-like coronavirus isolate bat-SL-CoVZC45 | 2018
MG772934 | Bat SL-CoV ZXC21 2018 | Bat SARS-like coronavirus isolate bat-SL-CoVZXC21 | 2018
MK211377 | Bat CoV YN2018C 2018 | Coronavirus BtRs-BetaCoV/YN2018C | 2018
MK211378 | Bat CoV YN2018D 2018 | Coronavirus BtRs-BetaCoV/YN2018D [a] | 2018
MN975262 | HKU-SZ-005b | Human 2019-nCoV HKU-SZ-005b | 2020
NC002645 | Human CoV 229E 2000 | Human coronavirus 229E | 2000
NC006577 | Human CoV HKU1 2004 | Human coronavirus HKU1 | 2004
NC009019 | Bat CoV HKU4-1 2006 | Bat coronavirus HKU4-1 | 2006
NC009020 | Bat CoV HKU5-1 2006 | Bat coronavirus HKU5-1 | 2006
NC014470 | Bat SARS-related CoV BM48-31 2009 | Bat coronavirus BM48-31/BGR/2008 | 2008
NC019843 | Human MERS-CoV 2012 | Middle East respiratory syndrome coronavirus | 2012
[a] One nucleotide was added within the M gene to maintain the sequence in-frame.

Genome characterization and phylogenetic analysis

Phylogenetic tree construction by the neighbour joining method was performed using MEGA X software, with bootstrap values being calculated from 1000 trees [12]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) was shown next to the branches [13]. The tree was drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method and were in the units of the number of amino acid substitutions per site [14]. All ambiguous positions were removed for each sequence pair (pairwise deletion option). Evolutionary analyses were conducted in MEGA X [15]. Multiple alignment was performed using CLUSTAL 2.1 and further visualized using BOXSHADE 3.21.
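As a rough illustration of the distance measure named above: the Poisson correction converts the proportion p of differing amino acid sites between two aligned sequences into d = -ln(1 - p), and pairwise deletion skips any site where either sequence of the pair has a gap or ambiguous residue. The sketch below is not MEGA X's implementation. The function name and the choice of '-', '?' and 'X' as excluded characters are assumptions made for illustration.

```python
import math

# Characters treated as gaps or ambiguous residues (an assumption for this sketch).
EXCLUDED = set("-?X")


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has an excluded character are dropped
    for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in EXCLUDED or b in EXCLUDED:
            continue  # pairwise deletion: skip this site for this pair
        compared += 1
        if a != b:
            different += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")

    p = different / compared  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)  # expected substitutions per site
```

For two toy sequences "MK-AY" and "MKTAF", the gapped column is skipped, one of the four remaining sites differs, and the distance is -ln(0.75).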
Structural analysis of orf8 was performed using PSI-blast-based secondary structure PREDiction (PSIPRED) [16]. For the prediction of protein secondary structure including beta sheet, alpha helix, and coil, initial amino acid sequences were input and analysed using neural networking and its own algorithm. Predicted structures were visualized and highlighted on the BOXSHADE alignment.
Prediction of transmembrane domains was performed using the TMHMM 2.0 server (http://www.cbs.dtu.dk/services/TMHMM/). Secondary structure prediction in the 5′-untranslated region (UTR) and 3′-UTR was performed using the RNAfold WebServer (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) with minimum free energy (MFE) and partition function in Fold algorithms and basic options.

Results and discussion

Genome organization

The single-stranded RNA genome of the 2019-nCoV was 29891 nucleotides in size, encoding 9860 amino acids. The G + C content was 38%. Similar to other βCoVs, the 2019-nCoV genome contains two flanking untranslated regions (UTRs) and a single long open reading frame encoding a polyprotein.
The 2019-nCoV genome is arranged in the order of 5′-replicase (orf1a/b)-structural proteins [Spike (S)-Envelope (E)-Membrane (M)-Nucleocapsid (N)]−3′ and lacks the hemagglutinin-esterase gene which is characteristically found in lineage A βCoVs (Figure 1).
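The G + C content quoted above is the share of G and C bases among all unambiguous bases in the genome. A minimal sketch, assuming the genome is available as a plain string; the function name `gc_content` is hypothetical, and ignoring ambiguous bases such as N is an assumption rather than something the paper states:

```python
def gc_content(sequence: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage.

    Only A, C, G, T and U are counted; any other character
    (for example N) is ignored, which is an assumption of this sketch.
    """
    bases = [base for base in sequence.upper() if base in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)
```

Applied to the MN975262 sequence, a function like this would be expected to return a value close to the 38% reported above.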

Figure 1. Betacoronavirus genome organization.
 The coronavirus genome comprises the 5′ untranslated region (5′-UTR) including the 5′ leader sequence, open reading frame (orf) 1a/b (yellow box) encoding non-structural proteins (nsp) for replication, structural proteins including envelope (orange box), membrane (red box) and nucleoprotein (cyan box), accessory proteins (purple boxes) such as orf 3, 6, 7a, 7b, 8 and 9b of the 2019-nCoV (HKU-SZ-005b) genome, and the 3′ untranslated region (3′-UTR).

Examples of each betacoronavirus lineage are human coronavirus (HCoV) HKU1 (lineage A), 2019-nCoV (HKU-SZ-005b) and SARS-CoV (lineage B), human MERS-CoV and bat CoV HKU4 (lineage C), and bat CoV HKU9 (lineage D).
The lengths of nsps and orfs are not drawn to scale.


 The human SARS-CoV 5′- and 3′- UTR were used as references to adjust the prediction results.

 

There are 12 putative, functional open reading frames (orfs) expressed from a nested set of 9 subgenomic mRNAs carrying a conserved leader sequence in the genome, 9 transcription-regulatory sequences, and 2 terminal untranslated regions (UTR).
The 5′- and 3′-UTRs are 265 and 358 nucleotides long, respectively.
The 5′- and 3′-UTR sequences of 2019-nCoV are similar to those of other βCoVs, with nucleotide identities of ≥83.6%.
The large replicase polyproteins pp1a and pp1ab, encoded by the partially overlapping 5′-terminal orf1a/b within the 5′ two-thirds of the genome, are proteolytically cleaved into 16 putative non-structural proteins (nsps). These putative nsps include two viral cysteine proteases, namely nsp3 (papain-like protease) and nsp5 (chymotrypsin-like, 3C-like, or main protease), as well as nsp12 (RNA-dependent RNA polymerase [RdRp]), nsp13 (helicase), and other nsps which are likely involved in the transcription and replication of the virus (Table 2).

There are no remarkable differences between the orfs and nsps of 2019-nCoV and those of SARS-CoV (Table 3).
The major distinctions between SARSr-CoV (SARS-related CoV) and SARS-CoV are in orf3b, Spike and orf8. Spike S1 and orf8 are especially variable and were previously shown to be recombination hot spots.

Table 2. Putative functions and proteolytic cleavage sites of 16 nonstructural proteins (nsps) in orf1a/b as predicted by bioinformatics.

NSP | Putative function/domain | Amino acid position | Putative cleavage site
nsp1 | suppress antiviral host response | M1 – G180 | (LNGG'AYTR)
nsp2 | unknown | A181 – G818 | (LKGG'APTK)
nsp3 | putative PL-pro domain | A819 – G2763 | (LKGG'KIVN)
nsp4 | complex with nsp3 and 6: DMV formation | K2764 – Q3263 | (AVLQ'SGFR)
nsp5 | 3CL-pro domain | S3264 – Q3569 | (VTFQ'SAVK)
nsp6 | complex with nsp3 and 4: DMV formation | S3570 – Q3859 | (ATVQ'SKMS)
nsp7 | complex with nsp8: primase | S3860 – Q3942 | (ATLQ'AIAS)
nsp8 | complex with nsp7: primase | A3943 – Q4140 | (VKLQ'NNEL)
nsp9 | RNA/DNA binding activity | N4141 – Q4253 | (VRLQ'AGNA)
nsp10 | complex with nsp14: replication fidelity | A4254 – Q4392 | (PMLQ'SADA)
nsp11 | short peptide at the end of orf1a | S4393 – V4405 | (end of orf1a)
nsp12 | RNA-dependent RNA polymerase | S4393 – Q5324 | (TVLQ'AVGA)
nsp13 | helicase | A5325 – Q5925 | (ATLQ'AENV)
nsp14 | ExoN: 3′–5′ exonuclease | A5926 – Q6452 | (TRLQ'SLEN)
nsp15 | XendoU: poly(U)-specific endoribonuclease | S6453 – Q6798 | (PKLQ'SSQA)
nsp16 | 2'-O-MT: 2'-O-ribose methyltransferase | S6799 – N7096 | (end of orf1b)
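The amino acid positions in Table 2 can be turned into product lengths and checked for consistency. The sketch below copies the start and end positions from the table and verifies that each cleavage product begins immediately after the previous one ends. The names `NSP_POSITIONS`, `nsp_length`, `chain_length`, `PP1A` and `PP1AB` are hypothetical helpers introduced for illustration; treating nsp1 to nsp11 as pp1a and nsp1 to nsp10 plus nsp12 to nsp16 as pp1ab is an interpretation of the shared start position of nsp11 and nsp12 in the table.

```python
from typing import Sequence

# Start and end amino acid positions, copied from Table 2.
NSP_POSITIONS = {
    "nsp1": (1, 180),
    "nsp2": (181, 818),
    "nsp3": (819, 2763),
    "nsp4": (2764, 3263),
    "nsp5": (3264, 3569),
    "nsp6": (3570, 3859),
    "nsp7": (3860, 3942),
    "nsp8": (3943, 4140),
    "nsp9": (4141, 4253),
    "nsp10": (4254, 4392),
    "nsp11": (4393, 4405),
    "nsp12": (4393, 5324),
    "nsp13": (5325, 5925),
    "nsp14": (5926, 6452),
    "nsp15": (6453, 6798),
    "nsp16": (6799, 7096),
}


def nsp_length(name: str) -> int:
    """Length in amino acids of one cleavage product (inclusive positions)."""
    start, end = NSP_POSITIONS[name]
    return end - start + 1


def chain_length(names: Sequence[str]) -> int:
    """Total length of a run of products, checking that they are contiguous."""
    total = 0
    previous_end = 0
    for name in names:
        start, end = NSP_POSITIONS[name]
        if start != previous_end + 1:
            raise ValueError(f"{name} does not start right after the previous product")
        total += end - start + 1
        previous_end = end
    return total


# Assumed composition of the two polyproteins (see lead-in).
PP1A = [f"nsp{i}" for i in range(1, 12)]
PP1AB = [f"nsp{i}" for i in range(1, 17) if i != 11]
```

With the Table 2 positions, nsp5 comes out at 306 residues, the nsp1 to nsp11 chain at 4405 residues and the chain ending in nsp16 at 7096 residues.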

Table 3. Amino acid identity between the 2019 novel coronavirus and bat SARS-like coronavirus or human SARS-CoV.

 
Amino acid identity (%) | 2019-nCoV vs. bat-SL-CoVZXC21 | 2019-nCoV vs. SARS-CoV
NSP1 | 96 | 84
NSP2 | 96 | 68
NSP3 | 93 | 76
NSP4 | 96 | 80
NSP5 | 99 | 96
NSP6 | 98 | 88
NSP7 | 99 | 99
NSP8 | 96 | 97
NSP9 | 96 | 97
NSP10 | 98 | 97
NSP11 | 85 | 85
NSP12 | 96 | 96
NSP13 | 99 | 100
NSP14 | 95 | 95
NSP15 | 88 | 89
NSP16 | 98 | 93
Spike | 80 | 76
Orf3a | 92 | 72
Orf3b | 32 | 32
Envelope | 100 | 95
Membrane | 99 | 91
Orf6 | 94 | 69
Orf7a | 89 | 85
Orf7b | 93 | 81
Orf8/Orf8b | 94 | 40
Nucleoprotein | 94 | 94
Orf9b | 73 | 73
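The percentages in Table 3 are pairwise amino acid identities. The paper does not state how gapped alignment columns were counted, so the sketch below assumes that identity is the share of identical residues among columns where neither sequence has a gap. The function name `percent_identity` is hypothetical, and other denominators (for example the full alignment length) would give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise or multiple alignment.

    Columns where either row has a gap are excluded from the denominator
    (an assumption of this sketch, not a statement of the paper's method).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

For example, two eight-residue rows that differ at a single position give 87.5% under this definition.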




Spike

(Figures 1-13 can be viewed in the original article:)

 https://www.tandfonline.com/doi/full/10.1080/22221751.2020.1719902

The Spike glycoprotein comprises S1 and S2 subunits. The S1 subunit contains a signal peptide, followed by an N-terminal domain (NTD) and receptor-binding domain (RBD), while the S2 subunit contains a conserved fusion peptide (FP), heptad repeats (HR) 1 and 2, a transmembrane domain (TM), and a cytoplasmic domain (CP). We found that the S2 subunit of 2019-nCoV is highly conserved and shares 99% identity with those of the two bat SARS-like CoVs (SL-CoV ZXC21 and ZC45) and human SARS-CoV (Figure 2). Thus, broad-spectrum antiviral peptides against S2 would be an important preventive and treatment modality for testing in animal models before clinical trials [18]. Though the S1 subunit of 2019-nCoV shares around 70% identity with that of the two bat SARS-like CoVs and human SARS-CoV (Figure 3(A)), the core domain of the RBD (excluding the external subdomain) is highly conserved (Figure 3(B)). Most of the amino acid differences in the RBD are located in the external subdomain, which is responsible for the direct interaction with the host receptor. Further investigation of this soluble variable external subdomain region will reveal its receptor usage, interspecies transmission and pathogenesis.

Unlike 2019-nCoV and human SARS-CoV, most known bat SARSr-CoVs have two stretches of deletions in the spike receptor binding domain (RBD) when compared with that of human SARS-CoV. However, some Yunnan strains such as WIV1 have no such deletions and can use human ACE2 as a cellular entry receptor. It is interesting to note that the two bat SARS-related coronaviruses ZXC21 and ZC45, being closest to 2019-nCoV, can infect suckling rats and cause inflammation in the brain tissue and pathological changes in the lung and intestine. However, these two viruses could not be isolated in Vero E6 cells and were not investigated further. The two retained deletion sites in the Spike genes of ZXC21 and ZC45 may lessen their likelihood of jumping species barriers imposed by receptor specificity.



Figure 2. Comparison of protein sequences of Spike stalk S2 subunit. Multiple alignment of Spike S2 amino acid sequences of 2019-nCoV HKU-SZ-005b (accession number MN975262), bat SARS-like coronavirus isolates bat-SL-CoVZXC21 and bat-SL-CoVZC45 (accession number MG772934.1 and MG772933.1, respectively) and human SARS coronavirus (accession number NC004718) was performed and displayed using CLUSTAL 2.1 and BOXSHADE 3.21, respectively. The black boxes represent the identity while the grey boxes represent the similarity of the four amino acid sequences.

RBD


Figure 3. Comparison of protein sequences of A. Spike globular head S1, and B. S1 receptor-binding domain (RBD) subunit. Multiple alignment of Spike S1 amino acid sequences of 2019-nCoV HKU-SZ-005b (accession number MN975262), bat SARS-like coronavirus isolates bat-SL-CoVZXC21, bat-SL-CoVZC45, bat-SL-CoV-YNLF_31C, bat-SL-CoV-YNLF_34C and bat SL-CoV HKU3-1 (accession number MG772934.1, MG772933.1, KP886808, KP886809 and DQ022305, respectively), human SARS coronavirus GZ02 and Tor2 (accession number AY390556 and AY274119, respectively) and Paguma SARS-CoV (accession number AY515512) was performed and displayed using CLUSTAL 2.1 and BOXSHADE 3.21, respectively. The black background represents the identity while the grey background represents the similarity of the amino acid sequences. The orange box indicates the region of the signal peptide, while the green and blue boxes indicate the core domain and receptor binding domain, respectively. Sequences of the RBD, highlighted in (A), were used for comparison. The external subdomain variable region of 2019-nCoV HKU-SZ-005b was predicted by comparison of amino acid similarity and published structural analysis [17]. The purple box indicates the external subdomain region.




ORF3

Orf3b

A novel short putative protein with 4 helices and no homology to existing SARS-CoV or SARSr-CoV proteins was found within Orf3b (Figure 4). It is notable that SARS-CoV deletion mutants lacking orf3b replicate to levels similar to those of wild-type virus in several cell types [19], suggesting that orf3b is dispensable for viral replication in vitro. But orf3b may have a role in viral pathogenicity, as Vero E6 but not 293T cells transfected with a construct expressing Orf3b underwent necrosis as early as 6 h after transfection and underwent simultaneous necrosis and apoptosis at later time points [20]. Orf3b was also shown to inhibit expression of IFN-β at the synthesis and signalling levels [21]. Subsequently, orf3b homologues identified from three bat SARS-related-CoV strains were found to be C-terminally truncated and to lack the C-terminal nucleus localization signal of SARS-CoV [22]. IFN antagonist activity analysis demonstrated that one SARS-related-CoV orf3b still possessed IFN antagonist and IRF3-modulating activities. These results indicated that different orf3b proteins display different IFN antagonist activities and that this function is independent of the protein's nuclear localization, suggesting a potential link between bat SARS-related-CoV orf3b function and pathogenesis. The importance of this new protein in 2019-nCoV will require further validation and study.
Figure 4. Analysis of orf3b. A. Multiple alignment of orf3b protein sequence between 2019-nCoV (HKU-SZ-005b), SARS-CoV and SARS-related CoV. B. A novel putative short protein found in orf3b.

Orf8: continued in a separate post



