-
First, we removed a total of 3,985 hexamers that could prime reverse transcription of human or bacterial ribosomal RNA (accession numbers: NR_046235.3 for the human 45S pre-ribosomal sequence, J01859.1 for Escherichia coli 16S ribosomal RNA, and NR_037007.2 for Staphylococcus aureus 16S ribosomal RNA) from a candidate dataset of 4,096 ($4^6$) random hexamers. Second, we constructed a single FASTA file containing all the virus-related sequences available from NCBI (8,584 viral genome sequences; release date 8 July 2014) together with their complementary sequences (17,168 virus-related sequences in total). Third, we aligned the remaining 111 hexamers against both sets of virus-related sequences and scored each hexamer by its number of perfect matches to either set. Fourth, we selected the 30 hexamers with the highest scores against the virus-related sequences (sense and antisense). Finally, we added 2 random nucleotides to the 5' end of each hexamer to increase the flexibility and diversity of targeted priming. These 6+2N octamers (V8; listed in Table 1) were used in the reverse-transcription step of VSITA.
| Motif | Motif | Motif | Motif | Motif | Motif |
|---|---|---|---|---|---|
| NNGGCAAT | NNGATATC | NNGCATTG | NNGTCTAG | NNAGTATA | NNAAGTAT |
| NNAATTGT | NNAATCTA | NNAATTGT | NNACAACG | NNACAATT | NNACTATT |
| NNATACTT | NNATTGCC | NNATTTTA | NNCGTATG | NNCGTTGT | NNCAATGC |
| NNCAATTG | NNCATACG | NNCCTAGA | NNCTAGAC | NNCTCGAG | NNTAGATT |
| NNTAAAAT | NNTATAAA | NNTATACT | NNTATATA | NNTCTAGG | NNTTTATA |

Table 1. Octamer (V8) Sequences Used for VSITA (5'-3')
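The selection steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual implementation: the function names are hypothetical, sequences are assumed to be uppercase DNA-alphabet strings (A, C, G, T), a hexamer is assumed to "prime" an rRNA when its reverse complement occurs in that rRNA, the "complementary sequences" are taken to be reverse complements, and ties in score are broken alphabetically.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq: str, kmer: str) -> int:
    """Count perfect (possibly overlapping) occurrences of kmer in seq."""
    count, start = 0, seq.find(kmer)
    while start != -1:
        count += 1
        start = seq.find(kmer, start + 1)
    return count


def select_hexamers(rrna_seqs, viral_seqs, top_n=30):
    """Hypothetical helper: pick the top-scoring non-rRNA hexamers as NN-octamers."""
    # Step 1: enumerate all 4^6 = 4,096 hexamers.
    all_hexamers = ["".join(p) for p in product("ACGT", repeat=6)]

    # Collect every 6-mer present in the rRNA sequences.
    rrna_kmers = {s[i:i + 6] for s in rrna_seqs for i in range(len(s) - 5)}

    # Assumption: a hexamer primes rRNA if its reverse complement is in the rRNA.
    candidates = [h for h in all_hexamers
                  if reverse_complement(h) not in rrna_kmers]

    # Step 2: viral sequences plus their (reverse) complements.
    targets = list(viral_seqs) + [reverse_complement(v) for v in viral_seqs]

    # Step 3: score each remaining hexamer by its number of perfect matches.
    scores = {h: sum(count_overlapping(t, h) for t in targets)
              for h in candidates}

    # Step 4: keep the top_n hexamers (ties broken alphabetically, an assumption).
    ranked = sorted(candidates, key=lambda h: (-scores[h], h))

    # Step 5: prepend two random positions (N) to the 5' end.
    return ["NN" + h for h in ranked[:top_n]]
```

For genome-scale input, counting all 6-mers in one pass per sequence would be faster than searching for each hexamer separately; the per-hexamer search is kept here only because it mirrors the described procedure step by step.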
-
Forty-five archived clinical samples of different types (serum, feces, and throat swabs) were included in this study. These samples had previously been tested by quantitative RT-PCR (qRT-PCR) (Supplementary Table 1, available at www.besjournal.com) and found to be positive for one of 14 different virus types/subtypes [Japanese encephalitis virus, Rotavirus, Norovirus, Human Adenovirus (hADV), Dengue virus, Influenzavirus A H1N1, Influenzavirus A H3N2, Influenzavirus A swl H1N1 (swlH1N1), Enterovirus 71 (EV71), Coxsackievirus A16 (CA16), Cytomegalovirus (CMV), Human immunodeficiency virus (HIV), Hepatitis B virus (HBV), Hepatitis C virus (HCV)]. These 45 samples were used to compare, in parallel, how well V8 and N6 priming enriched viral sequences and removed ribosomal sequences during reverse transcription, as measured by subsequent quantitative PCR (qPCR). Two additional groups were also included: 10 serum samples from patients with fever of unknown origin (F1-F10) and 10 feces samples from patients with diarrhea of unknown origin (D1-D10). These 20 samples were used to compare the enrichment performance of V8 and N6 by NGS analysis.
For all 65 samples, total RNA was extracted with the RNeasy Plus Mini Kit (QIAGEN) following the manufacturer's instructions. To evaluate the performance of VSITA, both the 4,096 random hexamers (N6) and the VSITA primer set (V8) were used in parallel to synthesize first-strand cDNA from each sample with SuperScript III (Invitrogen) following the manufacturer's instructions. For second-strand synthesis, Klenow fragment was added to the first-strand synthesis reaction of each sample, followed by a 2-hour incubation at 37 °C.
-
For the 45 previously validated positive samples, both N6-primed and V8-primed cDNAs were evaluated in parallel by qPCR assays targeting ribosomal RNA and the viral target sequence, following the protocols described in previous reports. Previously reported primer and probe sequences were used to ensure the reliability of the real-time PCR results (listed in Supplementary Table 1). For comparison, the relative quantity of the target sequence in V8-primed cDNA vs. its N6-primed counterpart was calculated as follows:
$$ {\rm{R}} = {2^{({\rm{Ct}}_{{\rm{N}}6} - {\rm{Ct}}_{{\rm{V}}8})}} \tag{1} $$

where R is the relative quantity of V8-primed target cDNA vs. N6-primed target cDNA; $\mathrm{Ct}_{\mathrm{N6}}$ is the average Ct value of four replicates of quantitative real-time PCR using N6-synthesized cDNA; and $\mathrm{Ct}_{\mathrm{V8}}$ is the average Ct value of four replicates of quantitative real-time PCR using V8-synthesized cDNA.
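As a worked illustration of Equation (1), the following minimal Python sketch computes R from replicate Ct values. The function name and the Ct values in the usage lines are hypothetical and are not data from this study; the formula also implicitly assumes that each PCR cycle doubles the target (100% amplification efficiency).

```python
from statistics import mean


def relative_quantity(ct_n6, ct_v8):
    """Relative quantity R of V8-primed vs. N6-primed target cDNA (Equation 1).

    ct_n6, ct_v8: replicate Ct values (four replicates each in the text).
    R > 1 means the target is more abundant in V8-primed cDNA;
    R < 1 means it is less abundant.
    """
    return 2 ** (mean(ct_n6) - mean(ct_v8))


# Hypothetical Ct values, for illustration only.
viral_r = relative_quantity([30.0, 30.0, 30.0, 30.0], [27.0, 27.0, 27.0, 27.0])
rrna_r = relative_quantity([20.0, 20.0, 20.0, 20.0], [22.0, 22.0, 22.0, 22.0])
print(viral_r, rrna_r)
```

In this made-up example, a Ct that is 3 cycles lower with V8 corresponds to an 8-fold enrichment, while a Ct that is 2 cycles higher corresponds to a 4-fold reduction (R = 0.25).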
-
For the 20 unidentified clinical samples, both V8-primed and N6-primed cDNAs were pre-enriched by multiple displacement amplification (MDA) using phi29 DNA polymerase (QIAGEN) according to the manufacturer's instructions prior to library construction, to ensure sufficient template for sequencing. Enriched cDNA products were evaluated with the Agilent High Sensitivity DNA assay on an Agilent 2100 Bioanalyzer. Fragmentation, end repair, adaptor ligation, size selection, and purification were performed successively following the instructions of the Ion Torrent Hi-Q kit (Thermo Fisher Scientific). Barcoded adaptors were used to distinguish the libraries. After library evaluation with the Agilent 2100 and real-time PCR, equal amounts of each of the 40 libraries (20 samples, each with V8 and N6 treatment) were combined into four pools, and each pool was loaded onto one of four 316v2 chips. Sequencing was performed with the 316 Chip Kit v2 on Ion Torrent PGM systems (Thermo Fisher Scientific) following the Ion Torrent instructions.
-
Raw sequencing data were generated by the Ion Torrent PGM system software and downloaded in BAM format. The raw data were analyzed with our in-house virus identification pipeline (VIP)[35].
Design of the V8 for VSITA
Clinical Samples and Pre-preparation
Pre-evaluation—Quantitative Real-time PCR Assay for 18S rRNA, 28S rRNA and Viral Genome
Evaluation by NGS
Data Analysis
-
The relative quantity (ratio) of target sequences in V8-synthesized cDNA vs. N6-synthesized cDNA from the same sample was calculated to evaluate the ability of the V8 primers to enrich target viral sequences and eliminate unwanted ribosomal sequences. Compared with N6, most V8-synthesized cDNAs (39/45 for 18S, 31/45 for 28S) showed a clear decrease in the quantities of human 18S and 28S ribosomal RNA sequences (Figure 1A and 1B). The exceptions were some of the samples infected with Rotavirus, hADV, swlH1N1, CMV, and HIV, and all the samples infected with HBV and HCV. Viral sequences, on the other hand, were clearly enriched by V8 in most (38/45) samples (Figure 1C). The exceptions were some of the samples infected with Norovirus, hADV, H3N2, Coxsackievirus A16, HIV, HBV, and HCV. In general, the relative quantities of 18S and 28S sequences were below 1 (Figure 1D), indicating that VSITA suppressed the amplification of 18S and 28S sequences more effectively than N6. Meanwhile, the relative quantities of viral sequences were above 1 (Figure 1D), indicating better enrichment of viral genome sequences by VSITA.
-
The fragment distribution of the enriched cDNA was evaluated on the Agilent 2100 Bioanalyzer system. A clear difference in fragment distribution was observed between V8 and N6 priming. Taking the first-strand synthesis of 2 samples as examples (Figure 2A vs. 2B, 2C vs. 2D), N6 priming produced a higher diversity of fragments that were more evenly distributed along the length axis (Figure 2A, 2C), whereas V8 priming tended to generate larger fragments concentrated in certain size regions (Figure 2B, 2D). Results from the other samples are shown in Supplementary Figure 1 (available at www.besjournal.com).
Figure 2. Fragment length analysis with Agilent 2100 bioanalyzer with: (A) N6 library from sample F1; (B) V8 library from sample F1; (C) N6 library from sample D1; (D) V8 library from sample D1.
Supplementary Figure 1. Fragment length analysis with Agilent 2100 bioanalyzer with samples: (A) N6 library from sample F1; (B) V8 library from sample F1; (C) N6 library from sample F2; (D) V8 library from sample F2; (E) N6 library from sample F3; (F) V8 library from sample F3; (G) N6 library from sample F7; (H) V8 library from sample F7; (I) N6 library from sample F10; (J) V8 library from sample F10; (K) N6 library from sample D1; (L) V8 library from sample D1; (M) N6 library from sample D5; (N) V8 library from sample D5; (O) N6 library from sample D6; (P) V8 library from sample D6; (Q) N6 library from sample D8; (R) V8 library from sample D8; (S) N6 library from sample D10; (T) V8 library from sample D10.
-
Among the 10 samples from patients with fever of unknown origin, the N6 approach identified 185 reads of dengue virus (DENV Ⅲ) in sample F10, covering 25.1% of the dengue virus genome, whereas the V8 approach identified 2,257 reads covering 94.88% of the same genome (Table 2). The presence of dengue virus in sample F10 was later validated by real-time PCR (Supplementary Table 1). In 4 samples (F1, F2, F3, and F7; Table 2), a few reads were generated by the V8 approach but none by the N6 approach. However, only very few reads from F1, F2, and F3 matched Hantavirus and Phlebovirus, and real-time PCR failed to amplify any target sequence from these samples. For sample F7, the V8 approach identified only 46 reads (22.31% of the genome), owing to the low quantity of dengue virus genome (real-time PCR Ct value of 35.14). Neither method detected viral sequences in the remaining 5 samples.
| Sample Number | Identified Virus | Ct Value^a | V8 Reads Hit | V8 Coverage (%) | V8 Viral Reads Percentage (%) | N6 Reads Hit | N6 Coverage (%) | N6 Viral Reads Percentage (%) |
|---|---|---|---|---|---|---|---|---|
| F1 | HFRS | neg | 3 | 2.76 | 0 | 0 | 0 | 0 |
| F2 | HFRS | neg | 1 | 0.38 | 0 | 0 | 0 | 0 |
| F3 | SFTS | neg | 3 | 0.94 | 0 | 0 | 0 | 0 |
| F7 | DENV Ⅲ | 35.14 | 46 | 22.31 | 0.08 | 0 | 0 | 0 |
| F10 | DENV Ⅲ | 25.97 | 2,257 | 94.88 | 0.61 | 185 | 25.10 | 0.19 |
| D1 | Echovirus 6 | 24.49 | 1,134 | 92.33 | 8.11 | 817 | 90.13 | 4.42 |
| D1 | Norovirus | 28.75 | 48,344 | 86.53 | 8.11 | 25,276 | 86.93 | 4.42 |
| D5 | Coxsackievirus A6 | 24.09 | 233 | 79.54 | 0.50 | 22 | 39.27 | 0.08 |
| D6 | Poliovirus | 26.70 | 60 | 71.94 | 0.40 | 20 | 32.59 | 0.14 |
| D8 | Coxsackievirus A4 | 26.50 | 171 | 78.26 | 0.80 | 51 | 47.55 | 0.30 |
| D10 | Coxsackievirus B2 | 23.47 | 346 | 96.09 | 1.90 | 116 | 88.58 | 0.50 |

Note. ^a Tests with reported real-time PCR methods to validate the identification of viruses.

Table 2. Comparison of the NGS Results between the Two Methods by Clinical Samples
Viral sequences were identified in 5 of the 10 diarrhea samples of unknown origin (D1, D5, D6, D8, and D10) by both the V8 and N6 approaches. Both methods indicated probable co-infection with Echovirus 6 and Norovirus in sample D1 (genome coverage of 92.33% and 90.13% for Echovirus 6, and 86.53% and 86.93% for Norovirus, for V8 and N6 respectively). In the other four samples, infection with Coxsackievirus A6, Poliovirus, Coxsackievirus A4, and Coxsackievirus B2, respectively, was found by both methods and validated by real-time PCR. For virus identification with NGS, V8 yielded more viral read hits in all the positive samples and higher genome coverage in all but one case (Norovirus in D1, where coverage was comparable: 86.53% vs. 86.93%) (Table 2).
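To make the comparison in Table 2 easier to read, the fold difference in viral read hits between the two approaches can be derived directly from the tabulated counts. The sketch below is only an illustration: the read counts are copied from Table 2, while the function name and the choice to return `None` when N6 produced no reads are assumptions made here, not part of the original analysis.

```python
# (sample, virus): (V8 reads hit, N6 reads hit), copied from Table 2.
READS = {
    ("F7", "DENV III"): (46, 0),
    ("F10", "DENV III"): (2257, 185),
    ("D1", "Echovirus 6"): (1134, 817),
    ("D1", "Norovirus"): (48344, 25276),
    ("D5", "Coxsackievirus A6"): (233, 22),
    ("D6", "Poliovirus"): (60, 20),
    ("D8", "Coxsackievirus A4"): (171, 51),
    ("D10", "Coxsackievirus B2"): (346, 116),
}


def fold_change(v8_reads, n6_reads):
    """V8/N6 ratio of read hits; None when N6 found no reads (ratio undefined)."""
    if n6_reads == 0:
        return None
    return v8_reads / n6_reads


for (sample, virus), (v8, n6) in READS.items():
    fc = fold_change(v8, n6)
    label = "n/a (no N6 reads)" if fc is None else f"{fc:.1f}x"
    print(f"{sample:>3} {virus:<18} V8={v8:>6} N6={n6:>6} fold={label}")
```

Because the libraries were pooled in equal amounts rather than normalized by read depth, these ratios are best read alongside the viral reads percentage columns of Table 2 rather than as a standalone measure of enrichment.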
Pre-evaluation by Real-time PCR
Fragment Evaluation of Enriched cDNA
NGS Evaluation
-
We acknowledge the Pediatric Research Institute, Children's Hospital of Hebei Province, Shijiazhuang, Hebei, China for providing fecal clinical samples, and the National Laboratory for Hemorrhagic Fever, National Institute for Viral Disease Control and Prevention (IVDC), Chinese Center for Disease Control and Prevention (CDC) for providing serum clinical samples.
-
| Name | Sequence (5'-3') | Citation |
|---|---|---|
| 18S rRNA^a | TACCACATCCAAGGAAGGGAGCA | [1] |
| | TGGAATTACCGCGGCTGCTGGCA | |
| 28S rRNA^a | AACGAGATTCCCACTGTCCC | [1] |
| | CTTCACCGTGCCAGACTAGAG | |
| JEV | AGAGCACCAAGGGAATGAAATAGT | [2] |
| | AATAGGTTGTAGTTGGGCACTCTG | |
| | FAM-CCACGCCACTCGACCCATAGACTG-BHQ | |
| Rota virus | ACCATCTACACATGACCCTC | [3] |
| | GGTCACATAACGCCCC | |
| | FAM-ATGAGCACAATAGTTAAAAGCTAACACTGTCAA-BHQ | |
| Noro virus GⅡ | CARGARBCNATGTTYAGRTGGATGAG | [4] |
| | TCGACGCCATCTTCATTCACA | |
| | FAM-TGGGAGGGCGATCGCAATCT-BHQ | |
| ADV | GATGGCCACCCCATCGATGMTGC | [5] |
| | GCGAACTGCACCAGACCCGGAC | |
| | VIC-TACATGCACATCGCCGGACAGGAMGCTTCGGAGT-TAMRA | |
| DENV | GGAAGTAGAGCAATATGGTACATGTG | [6] |
| | CCGGCTGTGTCATCAGCATAYAT | |
| | FAM-TGTGCAGTCCTTCTCCTTCCACTCCACT-BHQ | |
| H1N1 | GACCRATCCTGTCACCTCTGAC | [7] |
| | GGGCATTYTGGACAAAKCGTCTACG | |
| | HEX-TGCAGTCCTCGCTCACTGGGCACG-BHQ | |
| SWL-H1N1 | GGGTAGCCCCATTGCAT | [8] |
| | AGAGTGATTCACACTCTGGATTTC | |
| | HEX-TGGGTAAATGTAACATTGCTGGCTGG-BHQ | |
| H3N2 | ACCCTCAGTGTGATGGCTTCCAAA | [9] |
| | TAAGGGAGGCATAATCCGGCACAT | |
| | HEX-ACGCAGCAAAGCCTACAGCAACTGT-BHQ | |
| EV71 | GAG AGT TCT ATA GGG GAC AGT | [10] |
| | AGC TGT GCT ATG TGA ATT AGG AA | |
| | FAM-ACT TAC CCA GGC CCT GCC AGC TCC-TAMRA | |
| CaV16 | GGGAATTTCTTTAGCCGTGC | [11] |
| | CCCATCAARTCAATGTCCC | |
| | FAM-ACAATGCCCACCACGGGTACACA-BHQ | |
| CMV | CATGAAGGTCTTTGCCCAGTAC | [12] |
| | GGCCAAAGTGTAGGCTACAATAG | |
| | FAM-TGGCCCGTAGGTCATCCACACTAGG-TAMRA | |
| HIV | AGCATTATCAGAAGGAGCCA | |
| | GCAGCCTCTTCATTGATGGT | |
| | FAM-TGCATGGCTGCTTGATGTCCCC-TAMRA | |
| HBV | CCGTCTGTGCCTTCTCATCTG | [13] |
| | AGTCCAAGAGTYCTCTTATGYAAGACCTT | |
| | FAM-CCGTGTGCACTTCGCTTCACCTCTGC-MGB | |
| HCV | TGCGGAACCGGTGAGTACA | [14] |
| | CTTAAGGTTTAGGATTCGTGCTCAT | |
| | FAM-CACCCTATCAGGCAGTACCACAAGGCC-TAMRA | |
| Enterovirus^b | GGCTGCGYTGGCGGCC | [11] |
| | CCAAAGTAGTCGGTTCCGC | |
| | FAM-CTCCGGCCCCTGAATGCGG-BHQ | |

Note. ^a Tested with real-time PCR by SYBR Green. ^b Echovirus 6, Coxsackievirus A6, Coxsackievirus A4, and Coxsackievirus B2 were all tested with the HEV primers and probe, followed by Sanger sequencing for genotyping.

Supplementary Table 1. Real-time PCR Methods of Each Viral Pathogen
1. Chuang TW, Lee KM, Lou YC, et al. A Point Mutation in the Exon Junction Complex Factor Y14 Disrupts Its Function in mRNA Cap Binding and Translation Enhancement. J Biol Chem, 2016; 291, 8565-74.
2. Kalia M, Khasa R, Sharma M, et al. Japanese encephalitis virus infects neuronal cells through a clathrin-independent endocytic mechanism. J Virol, 2013; 87, 148-62.
3. Verheyen J, Timmen-Wego M, Laudien R, et al. Detection of adenoviruses and rotaviruses in drinking water sources used in rural areas of Benin, West Africa. Appl Environ Microbiol, 2009; 75, 2798-801.
4. Kageyama T, Kojima S, Shinohara M, et al. Broadly reactive and highly sensitive assay for Norwalk-like viruses based on real-time quantitative reverse transcription-PCR. J Clin Microbiol, 2003; 41, 1548-57.
5. Krafft AE, Russell KL, Hawksworth AW, et al. Evaluation of PCR testing of ethanol-fixed nasal swab specimens as an augmented surveillance strategy for influenza virus and adenovirus identification. J Clin Microbiol, 2005; 43, 1768-75.
6. Pang Z, Li A, Li J, et al. Comprehensive multiplex one-step real-time TaqMan qRT-PCR assays for detection and quantification of hemorrhagic fever viruses. PLoS One, 2014; 9, e95635.
7. Zou S, Han J, Wen L, et al. Human influenza A virus (H5N1) detection by a novel multiplex PCR typing method. J Clin Microbiol, 2007; 45, 1889-92.
8. Poon LL, Chan KH, Smith GJ, et al. Molecular detection of a novel human influenza (H1N1) of pandemic potential by conventional and real-time quantitative RT-PCR assays. Clin Chem, 2009; 55, 1555-8.
9. Chen Y, Cui D, Zheng S, et al. Simultaneous detection of influenza A, influenza B, and respiratory syncytial viruses and subtyping of influenza A H3N2 virus and H1N1 (2009) virus by multiplex real-time PCR. J Clin Microbiol, 2011; 49, 1653-6.
10. Tan EL, Yong LL, Quak SH, et al. Rapid detection of enterovirus 71 by real-time TaqMan RT-PCR. J Clin Virol, 2008; 42, 203-6.
11. Cui A, Xu C, Tan X, et al. The development and application of the two real-time RT-PCR assays to detect the pathogen of HFMD. PLoS One, 2013; 8, e61451.
12. Sugita S, Shimizu N, Watanabe K, et al. Use of multiplex PCR and real-time PCR to detect human herpes virus genome in ocular fluids of patients with uveitis. Br J Ophthalmol, 2008; 92, 928-32.
13. Zhao JR, Bai YJ, Zhang QH, et al. Detection of hepatitis B virus DNA by real-time PCR using TaqMan-MGB probe technology. World J Gastroenterol, 2005; 11, 508-10.
14. Martell M, Gomez J, Esteban JI, et al. High-throughput real-time reverse transcription-PCR quantitation of hepatitis C virus RNA. J Clin Microbiol, 1999; 37, 327-32.