SVM-Root: Identification of Root-Associated Proteins in Plants by Employing the Support Vector Machine with Sequence-Derived Features

Author:

Kumar Meher Prabina1ORCID,Hati Siddhartha2,Sahu Tanmaya Kumar34,Pradhan Upendra5,Gupta Ajit1,Rath Surya Narayan2

Affiliation:

1. Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India

2. Department of Bioinformatics, Odisha University of Agriculture and Technology, Bhubaneswar, Odisha, 751003, India

3. Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, New Delhi, 110012, India

4. Division of Crop Improvement, ICAR-Indian Grassland and Fodder Research Institute, Jhansi, 284003, India

5. ICAR-Indian Agricultural Statistics Research Institute Statistical Genetics New Delhi India

Abstract

Background: Root is a desirable trait for modern plant breeding programs, as the roots play a pivotal role in the growth and development of plants. Therefore, identification of the genes governing the root traits is an essential research component. With regard to the identification of root-associated genes/proteins, the existing wet-lab experiments are resource intensive and the gene expression studies are species-specific. Thus, we proposed a supervised learning-based computational method for the identification of root-associated proteins. Method: The problem was formulated as a binary classification, where the root-associated proteins and non-root-associated proteins constituted the two classes. Four different machine learning algorithms such as support vector machine (SVM), extreme gradient boosting, random forest, and adaptive boosting were employed for the classification of proteins of the two classes. Sequence-derived features such as AAC, DPC, CTD, PAAC, and ACF were used as input for the learning algorithms. Results: The SVM achieved higher accuracy with the 250 selected features of AAC+DPC+CTD than that of other possible combinations of feature sets and learning algorithms. Specifically, SVM with the selected features achieved overall accuracies of 0.74, 0.73, and 0.73 when evaluated with single 5-fold cross-validation (5F-CV), repeated 5F-CV, and independent test set, respectively. Conclusions: A web-enabled prediction tool SVM-Root (https://iasri-sg.icar.gov.in/svmroot/) has been developed for the computational prediction of the root-associated proteins. Being the first of its kind, the proposed model is believed to supplement the existing experimental methods and high throughput GWAS and transcriptome studies.

Publisher

Bentham Science Publishers Ltd.

Subject

Computational Mathematics,Genetics,Molecular Biology,Biochemistry

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3