2020 May 8;S0092-8674(20)30568-7. doi: 10.1016/j.cell.2020.05.006. Online ahead of print.
Host-Viral Infection Maps Reveal Signatures of Severe COVID-19 Patients
Highlights
• Viral-Track: a computational framework to analyze host-viral infection maps
• Viral-Track sorts infected from bystander cells and reveals virus-induced expression
• SARS-CoV-2 infects epithelial cells and alters immune landscape in severe patients
• Co-infection of SARS-CoV-2 and hMPV affects monocytes and dampens interferon response
SUMMARY
Viruses are a constant threat to global health, as highlighted by the current COVID-19 pandemic. Currently, a lack of data on how the human host interacts with viruses, including the SARS-CoV-2 virus, limits effective therapeutic intervention. We introduce Viral-Track, a computational method that globally scans unmapped single-cell RNA sequencing (scRNA-seq) data for the presence of viral RNA, enabling transcriptional cell sorting of infected versus bystander cells. We demonstrate the sensitivity and specificity of Viral-Track to systematically detect viruses from multiple models of infection, including hepatitis B virus, in an unsupervised manner. Applying Viral-Track to bronchoalveolar lavage samples from severe and mild COVID-19 patients reveals a dramatic impact of the virus on the immune system of severe patients compared to mild cases. Viral-Track detects an unexpected co-infection of the human metapneumovirus, present mainly in monocytes perturbed in type I interferon (IFN) signaling. Viral-Track provides a robust technology for dissecting the mechanisms of viral infection and pathology.
INTRODUCTION
The development of efficient vaccines against viral pathogens is considered one of the biggest achievements of modern medicine and has significantly contributed to the increase in life expectancy worldwide. However, no vaccines exist for many life-threatening viruses such as HIV (Burton, 2019), Zika virus (Pierson and Diamond, 2018), or hepatitis C virus (HCV) (Bailey et al., 2019). Additionally, efficient broad-spectrum antiviral drugs are still missing, making infectious diseases a significant challenge for modern health systems. Viruses can also trigger or fuel non-infectious diseases such as cancer (Young and Rickinson, 2004) and are suspected to contribute to various other chronic diseases such as Alzheimer disease (Itzhaki, 2018) and various auto-immune disorders (Münz et al., 2009). The recent emergence of highly pathogenic viruses such as the Ebola virus and the emerging SARS-CoV-2 pandemic recalls the constant threat that viruses represent to global health. So far, the SARS-CoV-2 pandemic has caused a global financial and social catastrophe and is expected to make a significant long-lasting impact on human health (Zhu et al., 2020). Despite intensive research efforts, little is known thus far regarding the interaction of the SARS-CoV-2 virus with the human host and, as a consequence, no efficient treatment has been designed so far (Chen et al., 2020). Moreover, only a few therapeutic targets have been identified, highlighting the urgency to develop additional strategies to dissect the virus-host interactions.
Single-cell RNA sequencing (scRNA-seq) is an emerging technology that has been extensively used to study several complex diseases, including cancer (Li et al., 2019), neurodegeneration (Keren-Shaul et al., 2017), and auto-immune (Zhang et al., 2019) and metabolic diseases (Jaitin et al., 2019), providing new insights and revealing new therapeutic targets and strategies (Yofe et al., 2020). In the context of infectious diseases, scRNA-seq studies identified the underlying cells and pathways interacting with various pathogens (Drayman et al., 2019; Shnayder et al., 2018; Steuerman et al., 2018; Zanini et al., 2018). During the immune response to a pathogen, a limited number of antigen-positive or infected cells initiate and modulate the host immune response (Blecher-Gonen et al., 2019), while most of the tissue response is propagated through cytokines, such as type I interferon (IFN) signaling, to bystander, uninfected cells. It is therefore essential to develop new analytical tools to identify the rare infected cells in order to better understand complex host-virus interactions underlying these pathologies. Multiple experimental tools have been developed over the years to track virus-infected cells in vivo, characterize the cellular state of the infected cells, and differentiate them from their bystander neighbors. These include fluorescently labeled pathogens or pathogens expressing fluorescent proteins (De Baets et al., 2015; Blecher-Gonen et al., 2019), as well as reporter mice (Lienenklaus et al., 2009). However, in the case of human clinical samples, these tools are limited, making the pathogen-infected cells and viral reservoir cell types hard to detect. Viruses exploit their host cells to first express viral genes, optimize the cellular environment, and then fully activate the viral replication program. Because scRNA-seq technologies rely on polyadenylated RNA isolation and amplification, current scRNA-seq methods can, in theory, detect these viral RNA programs and therefore enable accurate identification of the bona fide infected cells and their unique properties at single-cell resolution. While such an approach has already been used to study both in vitro (Drayman et al., 2019; Shnayder et al., 2018) and in vivo infection models (Steuerman et al., 2018), no general computational framework has been developed to detect viruses and analyze host-viral maps in clinical samples. Here, we present a new computational tool, called Viral-Track, that is designed to systematically scan for viral RNA in scRNA-seq data of physiological viral infections using a direct mapping strategy. Viral-Track performs comprehensive mapping of scRNA-seq data onto a large database of known viral genomes, providing precise annotation of the cell types associated with viral infections.
Integrating these data with the host transcriptome enables transcriptional sorting and differential profiling of the viral-infected cells compared to bystander cells. Using a new statistical approach for differential gene expression between infected and bystander cells, we are able to recover virus-induced programs and reveal key host factors required for viral replication. Viral-Track is able to annotate the viral program with high accuracy and sensitivity, as we demonstrate in several in vivo mouse models of infection, as well as human samples of hepatitis B virus (HBV) infection. Applying Viral-Track on bronchoalveolar lavage (BAL) samples from moderate and severe COVID-19 patients, we reveal the infection landscape of SARS-CoV-2 and its interaction with the host tissue. Our analysis shows a dramatic impact of the SARS-CoV-2 virus on the immune system of severe patients, compared to mild cases, including replacement of the tissue-resident alveolar macrophages with recruited inflammatory monocytes, neutrophils, and macrophages and an altered CD8+ T cell cytotoxic response. We find that SARS-CoV-2 mainly infects the epithelial and macrophage subsets. In addition, Viral-Track detects an unexpected co-infection of the human metapneumovirus in one of the severe patients. This study establishes Viral-Track as a broadly applicable tool for dissecting mechanisms of viral infections, including identification of the cellular and molecular signatures involved in virus-induced pathologies.
RESULTS
Viral-Track: An Unsupervised Pipeline for Characterization of Viral Infections in scRNA-Seq Data
All scRNA-seq computational packages implement a pipeline that initially aligns the sequenced reads to the expressed part of a reference host genome of the relevant profiled organism. Irrelevant reads, representing other organisms, primers, adaptors, template switching oligonucleotides, and other contaminants are then commonly discarded. We reasoned that during infection, and likely many other pathological processes, these reads can potentially carry valuable information about viral RNA that is discarded in this filtering step. In order to efficiently detect viral reads from raw scRNA-seq data in an unsupervised manner, we developed Viral-Track, an R-based computational pipeline (Figure 1A; STAR Methods). Briefly, Viral-Track relies on the STAR aligner (Dobin et al., 2013) to map the reads of scRNA-seq data to both the host reference genome and an extensive list of high-quality viral genomes (Stano et al., 2016). Because viral reads are highly repetitive and generate substantial sequencing artifacts, the viral genomes identified in Viral-Track with a sufficient number of mapped reads are then filtered, based on read mapping quality, nucleotide composition, sequence complexity, and genome coverage, to limit the occurrence of false-positives (STAR Methods). Due to the lack of high-quality viral genome annotations, Viral-Track includes de novo transcriptome assembly of the identified viruses using StringTie (Pertea et al., 2015).
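The filtering step described above can be illustrated with a minimal Python sketch. It is not the Viral-Track implementation (which is R-based and detailed in STAR Methods); the `ViralHit` record, the use of Shannon entropy as a complexity measure, and every threshold value are assumptions chosen only to show how the four criteria (read count, mapping quality, sequence complexity, genome coverage) could be combined.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class ViralHit:
    """Hypothetical per-genome summary produced after read mapping."""
    accession: str
    mapped_reads: int
    mean_mapq: float              # mean mapping quality of the mapped reads
    covered_fraction: float       # fraction of the genome covered by >= 1 read
    representative_sequence: str  # e.g. a consensus of the mapped reads


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits) of the nucleotide composition of seq."""
    counts = Counter(seq.upper())
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def passes_qc(hit: ViralHit,
              min_reads: int = 50,         # assumed threshold
              min_mapq: float = 30.0,      # assumed threshold
              min_entropy: float = 1.2,    # assumed threshold
              min_covered: float = 0.05):  # assumed threshold
    """Return True only if the hit satisfies all four illustrative criteria."""
    return (hit.mapped_reads >= min_reads
            and hit.mean_mapq >= min_mapq
            and shannon_entropy(hit.representative_sequence) >= min_entropy
            and hit.covered_fraction >= min_covered)
```

A low-complexity hit such as a poly(A) stretch has an entropy of 0 bits and is rejected regardless of how many reads map to it, which is the kind of artifact the filter is meant to remove.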
Finally, viral reads are demultiplexed, quantified using unique molecular identifiers (UMI), and assigned to unique viral transcripts and cells (Figures 1A and S1A). The Viral-Track algorithm has been designed to robustly handle various types of scRNA-seq datasets, as illustrated below, and is publicly accessible at https://github.com/PierreBSC/Viral-Track. In order to evaluate the specificity and sensitivity of Viral-Track, we benchmarked Viral-Track on several scRNA-seq datasets (Table S1). These datasets include a large number of experiments we conducted, as well as published studies, that span several tissues (lung, spleen, liver, and lymph node) and a wide range of viruses: influenza A, lymphocytic choriomeningitis virus (LCMV), vesicular stomatitis virus (VSV), herpes simplex virus 1 (HSV-1), human immunodeficiency virus (HIV), and HBV. We first evaluated mouse lungs infected in vivo by influenza A virus and sequenced using MARS-seq2.0 (Keren-Shaul et al., 2019; Steuerman et al., 2018). Viral-Track analysis specifically detected the 8 distinct influenza A viral segments (NC_002016 to NC_002023 Refseq nucleotide sequences) from the specific infecting strain (H1N1 Puerto Rico 8 strain) (Figure 1B). We performed transcriptome assembly to test the feasibility of reconstructing the viral transcriptome from 3′-enriched scRNA-seq data. The results were highly coherent with the current knowledge of influenza A transcriptome, exemplified by Viral-Track’s ability to identify documented spliced transcript structures with single-nucleotide precision. For instance, we identified the exact location of the key splicing site on segment 7 that gives rise to M2 transcript and links nucleotides 51 and 740 (Dubois et al., 2014) (Figure 1C).
Quantification of the number of viral reads across different experimental conditions was consistent with current knowledge of the disease, with lung stromal cells of non-immune lineages (CD45−) exhibiting a significantly higher viral load compared to immune cells (CD45+) (p = 0.039, two-tailed Welch’s t test) (Figure 1D). As inbred mice lack the influenza-specific restriction factor Mx1, influenza A infection is extremely virulent in inbred mice (Haller et al., 1980). Moreover, all influenza A mRNA are capped and polyadenylated, making them an optimal substrate for scRNA-seq isolation and amplification protocols. We therefore evaluated the sensitivity and specificity of Viral-Track in a more challenging dataset. In this model, photoactivatable-GFP (PA-GFP) mice were infected with LCMV (Armstrong acute strain), a virus lacking strong poly(A) mRNA signals (Burrell et al., 2017), via injection to the footpad. 72 h post-infection, CD45+ splenic immune cells from different spatial niches (T zone, B zone, marginal zone, and total spleen) were profiled using the NICHE-seq technology (Medaglia et al., 2017). Even though the LCMV viral mRNAs are not polyadenylated, we detected mRNA molecules that converted to cDNA through priming of the MARS-seq oligo(dT) RT primer, and Viral-Track successfully identified the two viral segments (LCMV segment L [NC_004291] and S [NC_004294]) (Figure S1B), although the number of detected reads was an order of magnitude lower than the number observed in influenza A infection (Figure 1E). We detected viral reads in samples from the marginal zone, B zone, and the total spleen, but not in T zone samples, and marginal zone samples exhibited significantly higher viral load compared to B zone and total spleen samples (Figure 1E; p = 0.0067 and 0.0083 respectively, two-tailed Welch’s t test).
This observation is in line with the biology of LCMV, which primarily infects macrophages and lymphocytes from the marginal zone of the spleen (Müller et al., 2002). We next evaluated whether Viral-Track is sensitive to barcode swapping during Illumina-based scRNA-seq (Griffiths et al., 2018), which, in the case of viral RNA detection, can lead to the false assignment of viral reads to uninfected cells. To this end, we infected mice with one of two different viruses, LCMV and VSV, and performed MARS-seq2.0 on CD45+ CD19− CD3− non-B/T cells from the auricular draining lymph node 1 day after infection (STAR Methods). All samples were sequenced concurrently to test for cross-sample viral read contamination. For both viruses, Viral-Track was able to identify the correct viral segments (Figures S1C and S1D), with no cross-contamination, evident by the absence of VSV reads detected in the LCMV-infected cells and vice versa (Figure S1E).
We further generalized Viral-Track for commonly used scRNA-seq technologies and non-RNA viruses. We applied Viral-Track to scRNA-seq data from a recent publication of human primary cells infected ex vivo with HSV-1, a linear double-stranded DNA virus, generated by the Drop-seq platform (Drayman et al., 2019; Macosko et al., 2015). We found that Viral-Track correctly detected and identified HSV-1 RNA specifically in the infected samples but not in the controls (NC_001806 Refseq nucleotide sequences) (Figures S1F and S1G).
Finally, we analyzed scRNA-seq data of CD4+ T cells infected ex vivo with HIV-1 (Bradley et al., 2018), generated using the droplet-based Chromium platform (Zheng et al., 2017). Viral-Track successfully identified HIV as the unique virus present in the infected samples (Figures S1H and S1I), but detected significant amounts of HIV-1 viral reads in one control sample, probably due to ambient contamination (Yang et al., 2020).
Defining the Host Viral Interactions of HBV Using Viral-Track
We further tested Viral-Track’s applicability for detecting viral reads in human clinical samples. For this purpose, we generated scRNA-seq data from a liver biopsy of an untreated hepatitis B patient and analyzed the data using Viral-Track. Viral-Track successfully identified HBV as the only virus present in the sample (Figure 1F) with 18,420 reads assigned to the HBV genome (NC_003977 Refseq sequence). Coverage analysis revealed a strong peak located at the 5′ end of the C gene, encoding for the main core protein, suggesting that the HBV virus is actively producing virions (Figure 1G). We then overlaid the viral data on the host transcriptome to identify infected and bystander populations. A total of 13,803 cells passed a lenient quality control, permitting apoptotic signals that may arise from viral infection. We identified several non-immune cell types (Figure S1J), including hepatocytes (expressing ALB and APOA2), as well as hepatocytes showing apoptotic signatures (ALB with high expression of mitochondrial genes), sinusoidal endothelial cells (FCN2), and epithelial cells (KRT7). We also observed several subsets of immune cells such as B cells (MS4A1), plasma cells (MZB1), conventional dendritic cells 1 (cDC1; XCR1), plasmacytoid dendritic cells (pDCs) (TCF4), and three different macrophage subsets (expressing TREM2, CD163, and FCN1, respectively). We observed a large diversity among the lymphocyte compartment with CD8+ T cells (CD8A), Th17 cells (CCR6, IL23A), γδ T cells (TRGC1), activated CD4 T cells (LEF1, OX40), natural killer (NK) cells (NKG7), and a distinct cluster of activated CD8+ T cells (CSF2 and TOX2). We analyzed infected cells using automated thresholding over the viral signal (Figure S1J; STAR Methods). As expected, hepatocytes and apoptotic hepatocytes were strongly enriched among the infected cells (Figures 1H and S1K).
Interestingly, we also detected viral reads in non-hepatocyte clusters, including two subsets of macrophages (CD163+ and TREM2+ populations, respectively), the cDC1 subset (XCR1+), as well as endothelial (OIT3+ cells) and epithelial cells (KRT7+) (Figures 1H and S1K). Infection of non-hepatocyte clusters, although with relatively low viral load, is consistent with several studies reporting active infection of macrophages (Faure-Dupuy et al., 2019).
Together, this extensive list of validations demonstrates that Viral-Track is a sensitive and accurate method to detect and identify, in an unsupervised manner, virus strains in diverse scRNA-seq samples, in different tissues, and at varying viral types and loads. Importantly, Viral-Track can be applied to human clinical samples to extract valuable insight into the biology of the host-virus interactions.
Viral-Track Identifies Infected versus Bystander Cells and Uncovers Virus-Induced Pathways
To further evaluate the accuracy of Viral-Track against a well-established model for tracking infection in single cells, we infected mice with a GFP-expressing LCMV virus (LCMV-GFP virus) (Medaglia et al., 2017). We performed MARS-seq on GFP+ splenocytes and total spleen cells 72 h post-infection and analyzed the sequenced cells (Figures S2A and S2B; STAR Methods). GFP+ cells were enriched for vUMI+ cells compared to total spleen (Figure S2A). We then calculated whether the cells positive for the LCMV-GFP signal (GFP+ cells) were similar to the ones designated by Viral-Track as containing viral UMIs (vUMI+). Following clustering and annotation, we observed similar proportions of GFP+ and vUMI+ cells across cell clusters (Figures 2A and S2C; R = 0.95, p = 9.0 × 10^−12), with monocytes, marginal zone B cells (MZBs), and macrophages being the major infected cell types. We then evaluated the transcriptional signatures within these two sets of cells by computing the Pearson correlation between each pair of cells. We observed a similar distribution of Pearson correlation within the GFP+ and vUMI+ monocyte cells (Figure 2B) that was significantly higher (median correlation of 0.65, 0.64, and 0.51, respectively) than the correlation observed between GFP− vUMI− bystander monocytes. We conclude that Viral-Track correctly identifies a homogeneous set of infected cells from in vivo scRNA-seq samples similar to the one identified by conventional reporter viruses, even in the more difficult scenario in which viral transcripts are poorly polyadenylated.
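The pairwise-correlation comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea (median Pearson correlation over all pairs of cells in a group); the input format, one list of expression values per cell with genes in the same order, and the function names are assumptions, and any normalization applied in the actual analysis is not reproduced here.

```python
import math
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def median_pairwise_correlation(cells):
    """Median Pearson correlation over all unordered pairs of cells."""
    if len(cells) < 2:
        raise ValueError("need at least two cells")
    return median(pearson(a, b) for a, b in combinations(cells, 2))
```

Computing this summary separately for each group of cells (for example GFP+, vUMI+, and bystander monocytes) gives one number per group that can then be compared, as in the medians reported in the text.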
We next evaluated the ability of Viral-Track to detect host factors associated with virus replication. For this purpose, we developed a statistical method that detects differentially expressed genes based on data binarization and complementary log-log regression (STAR Methods; Methods S1). We used this approach to test for transcriptional differences between bystander and infected cells during spleen LCMV infection across the three main infected cell types: macrophages, MZB cells, and monocytes. We observed that MZB cells were the most influenced by the viral infection, compared to monocytes and macrophages (107, 42, and 3 genes upregulated, respectively, Z score > 3) (Figure 2C). We performed Gene Ontology enrichment analysis on the upregulated genes in MZB cells and observed a significant enrichment in several pathways, including “chromosome organization,” “DNA replication,” and “cell cycle,” suggesting that LCMV triggers cell division in MZB cells (Figure 2D). Indeed, LCMV-infected MZB cells exhibited higher levels of cell cycle-related genes such as Smc2 (required for chromatin condensation), Cdc6 (regulator of DNA replication), and Stmn1 (regulator of mitotic spindle) (Figures 2E and S2D), but also fibrillarin (Fbl), a host factor whose expression is required by several viruses (Deffrasnes et al., 2016) (Figure 2E).
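To make the idea of binarization followed by complementary log-log (cloglog) regression concrete, the sketch below treats the simplest possible case: one gene, one binary covariate (infected versus bystander), and no other covariates. In that saturated case the maximum-likelihood estimates have a closed form, so no iterative fitting is needed. This is an assumption-laden illustration, not the method of Methods S1, which may include additional terms; the continuity correction and the function names are introduced here for illustration only. One common motivation for the cloglog link is that if UMI counts are Poisson with rate λ, then P(count > 0) = 1 − exp(−λ), so cloglog of the detection probability equals log λ and the coefficient can be read as a log fold change in expression rate.

```python
import math


def cloglog(p):
    """Complementary log-log link: log(-log(1 - p)) for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def detection_rate(n_detected, n_cells):
    """Fraction of cells in which the gene is detected (count > 0).

    A continuity correction (an assumption of this sketch) keeps the
    rate strictly between 0 and 1 so that the link is always defined.
    """
    return (n_detected + 0.5) / (n_cells + 1.0)


def link_variance(p, n_cells):
    """Delta-method variance of cloglog(p_hat) for a binomial proportion."""
    slope = 1.0 / ((1.0 - p) * (-math.log(1.0 - p)))  # d cloglog / dp
    return slope ** 2 * p * (1.0 - p) / n_cells


def cloglog_z_score(det_infected, n_infected, det_bystander, n_bystander):
    """Wald Z score for the infection coefficient of a cloglog model.

    Positive values mean the gene is detected more often in infected cells.
    """
    p_inf = detection_rate(det_infected, n_infected)
    p_by = detection_rate(det_bystander, n_bystander)
    beta = cloglog(p_inf) - cloglog(p_by)
    se = math.sqrt(link_variance(p_inf, n_infected)
                   + link_variance(p_by, n_bystander))
    return beta / se
```

A gene detected in 80 of 100 infected cells but only 20 of 100 bystander cells yields a Z score well above 3 under this sketch, whereas identical detection rates in both groups yield exactly 0.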
This is in line with a previous report highlighting the ability of LCMV to trigger an abortive form of cell division blocked in the G1 phase (Beier et al., 2015). Altogether, our results show that Viral-Track is sufficient to detect infected cells in in vivo scRNA-seq data and infer the differential gene expression in infected versus bystander cells.
A Single-Cell Map of SARS-CoV-2 Infection in Mild and Severe Patients
COVID-19 is a viral disease caused by SARS-CoV-2 infection, which has recently been recognized as the cause for a pandemic (Wang et al., 2020a). Little is currently known about the course of the disease and how the virus interacts with the host immune system in its mild and severe manifestations. To gain insights on the infection course in humans, we performed scRNA-seq and Viral-Track analysis on BALF samples from three mild and six severe COVID-19 patients (Liao et al., 2020). In total, 50,615 cells passed quality control and were analyzed using the MetaCell algorithm (Baran et al., 2019) (Figure 3A; STAR Methods). MetaCell analysis coarsely grouped the metacells into the myeloid, lymphoid, and epithelial lineages, and each lineage was further subdivided into smaller subsets (Figures 3A, 3B, and S3A).
Among epithelial cells, we identified epithelial progenitors (expressing SOX4), type II alveolar cells (AT2, expressing SFTPB), ciliated cells (FOXJ1), ionocytes (CFTR), goblet cells (MUC5B), and club cells (SCGB1A1; Figure S3B).
Lymphoid cells consisted of several subtypes of CD4+ T cells, including naive CD4+ T cells (expressing CCR7), regulatory T cells (Treg, expressing FOXP3), and T follicular helper cells (Tfh, expressing CXCL13 and PDCD1), but also diverse CD8+ subsets, such as NK cells (NCAM1), resident memory CD8+ T cells (Trm; CD8A and ZNF683), effector CD8+ T cells (GZMA and GZMK), and cytotoxic CD8+ T cells (GNLY, PRF1), as well as B cells (CD79A; Figure S3C).
The myeloid compartment exhibited a high diversity of cell states, including neutrophils (FCGR3B), mast cells (CPA3), alveolar macrophages (FABP4), dendritic cells (DCs; FSCN1), and plasmacytoid DCs (pDC; TCF4) as well as a large diversity of monocytes (FCN1) and monocyte-derived macrophages (SPP1) sub-populations (Figure S3D). These results were robust across different analysis platforms (Liao et al., 2020).
Comparison of the cellular landscape of mild and severe patients revealed key differences in the composition of BAL samples (Figures 3B and 3C). We found changes to each of the three compartments (Figures 3D–3F and S3E–S3G).
While alveolar macrophages and pDC were enriched in the myeloid compartment in the mild patients, the severe patients’ myeloid cells were characterized by a patient-specific diversity associated with accumulation of neutrophils, FCN1+ monocytes, and monocyte-derived SPP1+ macrophages (Figures 3D and S3E). Additionally, NK cells and naive CCR7+ CD4+ T cells were consistently enriched across severe patients’ BAL, while ZNF683hi CD8+ Trm cells were specific to mild patients (Figures 3E and S3F).
We also observed changes in the epithelial compartment, as severe patients exhibited higher numbers of club cells and AT2 cells (Figures 3F and S3G). By investigating expression patterns of shared gene expression programs, we observed that cytotoxic CD8+ cells and the CD4+ Tfh cells are the most proliferative compartments (Figure 3G), while a broad interferon type I response, a hallmark of viral response, is mainly expressed by neutrophils and, to a lesser extent, FCN1+ monocytes (Figure 3H).
We next performed in-depth differential gene expression analysis between subsets characteristic of mild or severe patients. We found that CD4+ T cells in the severe patients exhibit a more naive phenotype, expressing higher levels of IL7R, CCR7, S1PR1, and LTB. The CD8+ Trm cell signatures are restricted to the mild patients and have higher levels of the effector molecules XCL1, ITGAE, CXCR6, and ZNF683 (Figure 3I).
Comparing gene expression differences in myeloid types between severe and mild patients revealed disease severity-associated upregulation of inflammatory chemokine genes in SPP1+ monocyte-derived macrophage populations (CCL2, CCL3, CCL4, CCL7, and CCL8; Figure 3J), as well as genes associated with hypoxia or oxidative stress (HMOX1 and HIF1A), and downregulation of MHC class II (HLA-A and HLA-DQA1) and type I IFN genes (IFIT1 and OAS1).
Alveolar macrophages displayed a severity-associated signature, including upregulation of the chemokines CCL18 and CCL4L2 and the cathepsins CTSL and CTSB (Figure 3J).
Together, we identified dramatic differences between the mild and severe COVID-19 patients, including an inflammatory signature and a perturbed immune response associated with the severe manifestation of the COVID-19 disease. These also highlight potential immunotherapy treatment of the severe patients by targeting the hyperinflammatory response that is activated by inflammatory cytokines such as interleukin (IL)-6 and IL-8 (Liu et al., 2019) (Figure S3H).
Viral-Track Identifies Co-infection of SARS-CoV-2 with the Human Metapneumovirus
To characterize the in vivo crosstalk of SARS-CoV-2 with its human host, we applied Viral-Track on the data generated from the nine SARS-CoV-2 patients and the rich cellular landscape we identified. SARS-CoV-2 transcripts were detected in all six severe samples in variable amounts, ranging from fewer than 400 transcripts to more than 15,000 (Figures 4A and S4A). In contrast, no viral reads were detected in the three mild patients (Figure 4A). Coverage analysis revealed that the majority of the viral reads mapped to the 3′ end of the viral segment and corresponded to positive-stranded RNA (Figure 4B). This is in agreement with the coronavirus transcription: due to a nested transcription process, all genomic and subgenomic RNA molecules share the same 3′ end (Masters, 2006). We then analyzed the enrichment of vUMIs in the cell populations represented in the BAL samples. We observed a strong enrichment of viral reads in the ciliated and epithelial progenitor population, two known cellular targets of the virus, which express the main receptor of the SARS-CoV-2 virus ACE2, as well as TMPRSS2, a protease essential for SARS-CoV-2 entry (Figures 4C and S4B; Table S2) (Hoffmann et al., 2020). We also observed enrichment of SARS-CoV-2 reads in the SPP1+ macrophage population, suggesting either that SARS-CoV-2 can infect immune cells from the myeloid compartment or that SPP1+ macrophages phagocytose infected cells or viral particles. Differential gene expression analysis between vUMI+ infected and vUMI− bystander SPP1+ macrophages in the patients with the highest viral load revealed that infected macrophages have a higher expression of chemokines (CCL7, CCL8, and CCL18) and APOE, and a lower expression of TAOK1, a serine/threonine-protein kinase in the p38 MAPK cascade (Figure S4C).
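The 3′ coverage bias noted in the coverage analysis above can be summarized with a single number, the fraction of reads that start within the last part of the genome. The Python sketch below is a hypothetical illustration: the function name, the 0-based position convention, and the default 10% window are assumptions, and the genome length used in the usage example (29,903 nt, the length of the SARS-CoV-2 reference NC_045512.2) is given only for illustration.

```python
def three_prime_fraction(read_starts, genome_length, window_fraction=0.1):
    """Fraction of reads starting in the last window_fraction of the genome.

    read_starts: 0-based alignment start positions on the positive strand.
    Returns 0.0 when there are no reads.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    if not 0.0 < window_fraction <= 1.0:
        raise ValueError("window_fraction must be in (0, 1]")
    if not read_starts:
        return 0.0
    cutoff = genome_length * (1.0 - window_fraction)
    in_window = sum(1 for pos in read_starts if pos >= cutoff)
    return in_window / len(read_starts)


# Usage with made-up positions: three of the four reads start in the
# final 10% of a 29,903-nt genome.
example_fraction = three_prime_fraction([100, 29000, 29500, 29800], 29903)
```

Under uniform coverage this fraction would be close to the window size itself (about 0.1 here), so values far above it indicate 3′ enrichment of the kind expected from nested transcription combined with 3′-biased library preparation.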
Interestingly, CD147 (also known as BSG), a potential new SARS-CoV-2 receptor (Wang et al., 2020b), is expressed by all cell types, including immune cells, suggesting alternative routes for the virus to infect these cells.
Often in cases of infectious diseases, the specific infecting virus is not known, or may be accompanied by co-infection with additional unknown viruses. Viral-Track applies an unsupervised mapping strategy and is optimally designed to systematically profile the source of infection or co-infections in human clinical samples. To our surprise, Viral-Track analysis of data from one of the severe patients (S1) revealed the presence of a second virus, the human metapneumovirus (hMPV) (NC_039199 Refseq sequence, Figure 4D) with more than one million reads mapped to hMPV in this specific patient. hMPV is a non-segmented, single-stranded, and negative-sense RNA virus that is responsible for upper and lower respiratory tract infections in mostly young (<5 years) children but can also target elderly as well as immunocompromised patients (Panda et al., 2014). hMPV has been implicated as a possible source of co-infection with the original SARS-CoV virus (Chan et al., 2003). Coverage analysis revealed that most reads fall into the N, P, M, F, M2, SH, G, but not L, genes of hMPV (Figure 4E). We observed a typical pattern of biased scRNA-seq coverage, indicating that the N, P, M, F, M2, SH, and G genes are actively transcribed, and suggesting that the hMPV was active and replicating at the time of sample collection. Analysis of the viral UMI distribution across cells revealed a substantial viral load in a large subset of the cells, spanning hundreds to thousands of vUMIs per infected cell (Figure 4F), independently of the total host UMIs in that cell (Figure S4D). We mapped the infected cells and characterized their distribution across cell types. The infected patient is characterized by high levels of monocytes and CD4+ T cells (Figure S4E).
Unlike the SARS-CoV-2 virus infection map, hMPV-infected cells were highly enriched in the monocyte compartment but not in the epithelial and SPP1+ macrophage compartments (Figure 4G). We tested whether the hMPV could alter the function of the infected monocytes, and therefore influence the course of the disease. Using Viral-Track, we detected a large number of up- and downregulated genes in infected monocytes compared to bystander monocytes (Figure 4H). Interestingly, several key receptor genes required for monocyte activation such as CD16 (FCGR3B), G-CSF receptor (CSF3R), and the formyl peptide receptor (FPR1) were downregulated in the infected compared to the bystander cells. Moreover, we observed a dramatic downregulation of type I interferon signaling and interferon-stimulated genes (ISGs), including viral restriction factors (e.g., IFIT3). A gene set enrichment analysis (Figure S4F) revealed a strong enrichment of interferon response genes in the downregulated gene set, suggesting that the hMPV is strongly downregulating the IFN response pathway. Several anti-inflammatory genes were upregulated, including LILRB4 (a potent inhibitor of monocyte activation) (Lu et al., 2009) and MITF, a transcription factor known to be a critical suppressor of innate immunity (Harris et al., 2018).
Last, we observed a positive and significant association between total number of hMPV UMIs and production of type I IFN, highlighting that while hMPV dampens the response to type I IFN, production of this signal is highly restricted to a rare (~1%) population of cells with a high viral load (Figure S4G). Altogether, our analysis described the distribution of SARS-CoV-2-infected cells in patients’ BAL and revealed the presence of a viral co-infection by the hMPV that dampens the immune activation of the monocyte compartment in the infected patient. Further large-scale analyses of mild versus severe patients need to be conducted to better understand whether the co-infection is correlated with, or even causative in, SARS-CoV-2 pathology.
DISCUSSION
The virosphere contains hundreds of thousands of species that constantly interact with their host cells. Over the years, several genomic techniques have been developed to detect virus-derived sequences in human samples. For instance, deep sequencing assays are unbiased and sensitive in their ability to detect extremely rare viral sequences (Moustafa et al., 2017), but do not provide information about the infected cells and the cellular changes induced by the infection. Alternatively, it is possible to combine DNA probes with scRNA-seq to enrich for viral sequences and increase the sensitivity of the assay, but this requires prior knowledge of the viruses present in each sample (Zanini et al., 2018).
Here, we present Viral-Track, a robust and unsupervised computational pipeline that can detect viral RNA in any scRNA-seq dataset without the need for experimental modifications or prior knowledge of the infecting agent. Viral-Track was benchmarked on data originating from various tissues, infected by viruses with marked differences in their RNA properties, and generated with different scRNA-seq platforms. We demonstrate that Viral-Track can readily provide essential information on infection status in clinical samples, identify infected cells, probe viral-induced transcriptional alterations, and reveal cases of co-infection. In practice, only 70%–85% of scRNA-seq reads map to the host genome and represent polyadenylated exonic host transcripts, whereas the remainder of the data is usually overlooked in analysis. We show that these unmapped scRNA-seq reads, in pathological human samples, potentially contain valuable information on viral infection and can be effectively used for viral genome assembly. Viral-Track can resolve complex cellular ecosystems perturbed by viral infection and provide an unbiased map of the infected cells, as well as the transcriptional perturbations induced by the virus at the single-cell level. We combine Viral-Track with a novel statistical approach to detect differentially expressed genes from scRNA-seq data, therefore allowing the detection of gene expression changes triggered by viral infection and differentiating them from the more abundant bystander effects, such as type I IFN signaling, at the single-cell level.
Further advances will focus on applying Viral-Track on large-scale datasets containing scRNA-seq data from dozens of samples, leading to robust single-cell viral metagenomic studies that characterize the viral evolution and interactions of virus-induced disease mechanisms with host genetics.
Here, we applied scRNA-seq and Viral-Track analysis to COVID-19 patient-derived samples to provide a cellular and viral atlas of the BAL lung cells from COVID-19 patients. This analysis revealed the diversity of the immune responses across COVID-19 patients and between mild and severe patients.
We expect that as the pandemic keeps spreading and global research efforts grow, additional scRNA-seq samples from COVID-19 patients will be generated, including patients treated with emerging immunotherapies (Liu et al., 2019). Such an approach might help to solve key questions including the contribution of the humoral response (Iwasaki and Yang, 2020), the role of the IL6 pathway (Herold et al., 2020), and the immune memory induced by the virus (Prompetchara et al., 2020).
Viral-Track can contribute to the global effort to identify the different cellular compartments that are targeted and affected by COVID-19 and other viruses and to detect possible co-infection by unexpected viruses. Co-infections are gaining recognition in the scientific and medical community as critical factors in disease prognosis (Zhang et al., 2020). So far, research focused mainly on co-infections of bacterial sources or of well-known viruses such as influenza A (Wu et al., 2020). Understanding the diversity of viral co-infections and their mechanisms of immune suppression at the cellular and molecular level could therefore provide highly valuable information and lead toward possible therapeutic targets, especially for severe patients, whose treatment options are limited.
Limitations
Viral-Track is a new and powerful tool ..