HierCC: a multi-level clustering scheme for population assignments based on core genome MLST

Author:

Zhou Zhemin (1), Charlesworth Jane (1), Achtman Mark (1)

Affiliation:

1. Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK

Abstract

Motivation: Routine infectious disease surveillance is increasingly based on large-scale whole-genome sequencing databases. Real-time surveillance would benefit from immediate assignments of each genome assembly to hierarchical population structures. Here we present pHierCC, a pipeline that defines a scalable clustering scheme, HierCC, based on core genome multi-locus typing that allows incremental, static, multi-level cluster assignments of genomes. We also present HCCeval, which identifies optimal thresholds for assigning genomes to cohesive HierCC clusters. HierCC was implemented in EnteroBase in 2018 and has since genotyped >530 000 genomes from Salmonella, Escherichia/Shigella, Streptococcus, Clostridioides, Vibrio and Yersinia.

Availability and implementation: EnteroBase: https://enterobase.warwick.ac.uk/. Source code and instructions: https://github.com/zheminzhou/pHierCC

Supplementary information: Supplementary data are available at Bioinformatics online.

Funder

Wellcome Trust

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

