skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements

Author:

Zhang Xiaolei Brian (1, 2), Oualline Grace (1, 2), Shaw Jim (3), Yu Yun William (2, 3, 4)

Affiliation:

1. Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, United States

2. Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, United States

3. Department of Mathematics, University of Toronto, Toronto, ON M5S2E4, Canada

4. Department of Computer and Mathematical Sciences, University of Toronto at Scarborough, Toronto, ON M1C1A4, Canada

Abstract

Motivation: Mobile genetic elements (MGEs) are as ubiquitous in nature as they are varied in type, ranging from viral insertions to transposons to incorporated plasmids. Horizontal transfer of MGEs across bacterial species may also pose a significant threat to global health, because MGEs can harbor antibiotic resistance genes. However, despite cheap and rapid whole-genome sequencing, the varied nature of MGEs makes it difficult to fully characterize them, and existing methods for detecting MGEs often do not agree on what should count as one. In this manuscript, we first define and argue in favor of a divergence-based characterization of mobile genetic elements.

Results: Using that paradigm, we present skandiver, a tool designed to efficiently detect MGEs from whole-genome assemblies without the need for gene annotation or markers. skandiver determines mobile elements via genome fragmentation, average nucleotide identity (ANI), and divergence time. By building on the scalable skani software for ANI computation, skandiver can query hundreds of complete assemblies against >65 000 representative genomes in a few minutes using 19 GB of memory, providing a scalable and efficient method for elucidating mobile element profiles in incomplete, uncharacterized genomic sequences. For isolated and integrated large plasmids (>10 kb), respectively, recall was 48% and 47% for skandiver, 59% and 17% for MobileElementFinder, and 86% and 32% for geNomad. On isolated large plasmids, skandiver's recall is therefore lower than that of the state-of-the-art reference-based methods geNomad and MobileElementFinder. However, skandiver achieves higher recall on integrated plasmids and, unlike the other methods, does so without comparing against a curated database, making it suitable for discovery of novel MGEs.

Availability and implementation: https://github.com/YoukaiFromAccounting/skandiver
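The abstract names three ingredients (genome fragmentation, ANI, and divergence time) without giving details, so the following is only a minimal sketch of how such a divergence-based filter could be put together, not skandiver's actual implementation. It assumes that a fragment is a candidate MGE when it shows high ANI to a reference genome whose species diverged from the host long ago. The `Hit` record, the function names, and both thresholds are hypothetical, and the per-fragment ANI and divergence values are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One fragment-vs-reference comparison (hypothetical record layout)."""
    fragment_id: int
    reference: str
    ani: float             # average nucleotide identity, in percent
    divergence_mya: float  # assumed host/reference divergence time, in Mya


def fragment(sequence: str, size: int) -> list[str]:
    """Split a sequence into consecutive chunks of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def flag_candidates(hits, min_ani=95.0, min_divergence_mya=100.0):
    """Return sorted ids of fragments that look 'too similar' to a distant genome.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    flagged = {
        h.fragment_id
        for h in hits
        if h.ani >= min_ani and h.divergence_mya >= min_divergence_mya
    }
    return sorted(flagged)


# Toy usage: fragment 1 is nearly identical to a distantly related genome.
chunks = fragment("ACGTACGTAC", 4)
hits = [
    Hit(0, "close_relative", ani=98.0, divergence_mya=5.0),
    Hit(1, "distant_species", ani=99.0, divergence_mya=500.0),
    Hit(2, "distant_species", ani=80.0, divergence_mya=500.0),
]
candidates = flag_candidates(hits)
```

The intuition behind the sketch is that vertically inherited sequence should accumulate differences roughly in proportion to divergence time, so near-identical sequence shared between distant lineages points to recent horizontal transfer.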

Funder

Natural Sciences and Engineering Research Council of Canada

Carnegie Mellon University

NSERC CGS-D

Publisher

Oxford University Press (OUP)
