Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline

Author:

Kapatai Georgia¹, Sheppard Carmen L.¹, Al-Shahib Ali², Litt David J.¹, Underwood Anthony P.², Harrison Timothy G.¹, Fry Norman K.¹

Affiliation:

1. Respiratory and Vaccine Preventable Bacterial Reference Unit, Public Health England, London, United Kingdom

2. Infectious Disease Informatics, Public Health England, London, United Kingdom

Abstract

Streptococcus pneumoniae typically expresses one of 92 serologically distinct capsule polysaccharide (cps) types (serotypes). Some of these serotypes are closely related to each other; using the commercially available typing antisera, these are assigned to common serogroups containing types that show cross-reactivity. In this serotyping scheme, factor antisera are used to allocate serotypes within a serogroup, based on patterns of reactions. This serotyping method is technically demanding and requires considerable experience, and the reading of the results can be subjective. This study describes the analysis of the S. pneumoniae capsular operon genetic sequence to determine serotype-distinguishing features, and the development, evaluation and verification of an automated whole genome sequence (WGS)-based serotyping bioinformatics tool, PneumoCaT (Pneumococcal Capsule Typing). Initially, WGS data from 871 S. pneumoniae isolates were mapped to reference cps locus sequences for the 92 serotypes. Thirty-two of 92 serotypes could be unambiguously identified based on sequence similarities within the cps operon. The remaining 60 were allocated to one of 20 ‘genogroups’ that broadly correspond to the immunologically defined serogroups. By comparing the cps reference sequences for each genogroup, unique molecular differences were determined for serotypes within 18 of the 20 genogroups and verified using the set of 871 isolates. This information was used to design a decision-tree style algorithm within the PneumoCaT bioinformatics tool to predict capsular type to serotype level for 89/94 types (92 serotypes + 2 molecular types/subtypes) from WGS data and to serogroup level for serogroups 24 and 32, which currently comprise 2.1% of UK-referred invasive isolates submitted to the National Reference Laboratory (NRL), Public Health England (June 2014–July 2015).
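The two-stage logic described in this paragraph (match against reference cps loci, then resolve within a genogroup using distinguishing features) can be illustrated with a minimal Python sketch. This is not PneumoCaT's code. The match threshold, the genogroup table, the feature names and the function `predict_capsular_type` are all hypothetical, chosen only to show the general shape of a decision-tree style call.

```python
from typing import Dict, Optional

# Hypothetical cut-off for accepting a reference match; the abstract gives no value.
MIN_MATCH = 0.90

# Hypothetical genogroup table: each member type is defined by the states of
# one or more distinguishing features (names and values are illustrative only).
GENOGROUPS: Dict[str, Dict[str, Dict[str, str]]] = {
    "G_example": {
        "type_X": {"gene_a": "present", "snp_1": "A"},
        "type_Y": {"gene_a": "present", "snp_1": "G"},
        "type_Z": {"gene_a": "absent"},
    },
}

# Reference loci that belong to a genogroup; any other locus is called directly.
LOCUS_TO_GENOGROUP = {
    "type_X": "G_example",
    "type_Y": "G_example",
    "type_Z": "G_example",
}


def predict_capsular_type(match: Dict[str, float],
                          features: Dict[str, str]) -> Optional[str]:
    """Return a predicted type, a genogroup name, or None if nothing matches."""
    if not match:
        return None
    # Stage 1: pick the reference cps locus with the best match score.
    best = max(match, key=match.get)
    if match[best] < MIN_MATCH:
        return None  # no reference locus matched well enough
    genogroup = LOCUS_TO_GENOGROUP.get(best)
    if genogroup is None:
        return best  # unambiguous locus, no second stage needed
    # Stage 2: compare observed features with each member's profile.
    candidates = [
        name
        for name, profile in GENOGROUPS[genogroup].items()
        if all(features.get(key) == value for key, value in profile.items())
    ]
    # Report a type only if exactly one member fits; otherwise report the group.
    return candidates[0] if len(candidates) == 1 else genogroup
```

Falling back to the genogroup name when the features do not single out one member mirrors the abstract's distinction between serotype-level and serogroup-level predictions, although how the tool itself handles such cases is not stated here.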
PneumoCaT was evaluated with an internal validation set of 2,065 UK isolates covering 72/92 serotypes, including 19 non-typeable isolates, and an external validation set of 2,964 isolates from Thailand (n = 2,531), USA (n = 181) and Iceland (n = 252). PneumoCaT was able to predict serotype in 99.1% of the typeable UK isolates and in 99.0% of the non-UK isolates. Concordance was evaluated in UK isolates where further investigation was possible; in 91.5% of the cases the predicted capsular type was concordant with the serologically derived serotype. Following retesting, concordance increased to 99.3%, and in most resolved cases (97.8%; 135/138) discordance was shown to be caused by errors in original serotyping. Replicate testing demonstrated that PneumoCaT gave 100% reproducibility of the predicted serotype result. In summary, we have developed a WGS-based serotyping method that can predict capsular type to serotype level for 89/94 serotypes and to serogroup level for the remaining four. This approach could be integrated into routine typing workflows in reference laboratories, reducing the need for phenotypic immunological testing.
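As a quick arithmetic check, the fractions quoted in the abstract can be converted to percentages with a few lines of Python. The helper `percent` is a hypothetical name; the 135/138 figure is stated in the abstract as 97.8%, whereas the percentage for 89/94 is derived here and is not quoted in the abstract.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Resolved discordant cases attributed to errors in the original serotyping.
resolved_by_serotyping_error = percent(135, 138)

# Share of the 94 types that are predicted to serotype level (derived figure).
serotype_level_share = percent(89, 94)

print(resolved_by_serotyping_error, serotype_level_share)
```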

Publisher

PeerJ

Subject

General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; General Medicine; General Neuroscience
