Prediction of Breast cancer using integrated machine learning-fuzzy and dimension reduction techniques

Author:

Prusty Sashikanta 1, Das Priti 2, Dash Sujit Kumar 3, Patnaik Srikanta 4, Prusty Sushree Gayatri Priyadarsini 1

Affiliation:

1. Department of Computer Science & Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar, India

2. Professor & Head of the Department, Department of Pharmacology, PRM Medical College & Hospital, Baripada, Odisha, India

3. Department of Electrical & Electronics Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar, India

4. Director of Interscience Institute of Management & Technology (IIMT), Bhubaneswar, India

Abstract

Over the last two decades, the incidence of breast cancer (BC) has continued to rise despite extensive epidemiological and clinical study. Considerable research has addressed BC diagnosis, and some of it is discussed in the literature section. Major issues remain, however, in handling the fault feature matrix generated by traditional feature extraction methods. As a result, the complexity of fault classification has risen, which negatively affects the accuracy and effectiveness of fault identification. This research therefore proposes a novel hybridized machine learning-fuzzy and dimension reduction (MLF-DR) model to improve the decision capability and efficiency of an ML model. A feature-based class-togetherness fuzzification method is applied to every feature. The novelty of the work lies in exploring all possibilities between cancerous and non-cancerous cells by implementing a fuzzy inference system (FIS) in the data analysis phase and DR techniques in the preprocessing phase to select the best features. With the aim of helping to reduce the incidence of BC and prevent needless deaths, the approach proceeds in three steps: (i) an FIS to interpret input values; (ii) principal component analysis (PCA) and recursive feature elimination (RFE) to select the best features; and (iii) logistic regression (LR) and random forest (RF) models to predict BC from these features. All experiments were carried out on the Wisconsin Breast Cancer Dataset (WBCD), freely available from the Kaggle repository, using Python in Jupyter Notebook version 6.4.3. The key finding is that the LR-PCA model (8 components) reliably obtains the defect diagnosis results with 99.1% accuracy, compared with the individual LR and RF models.
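The dimension reduction and classification steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's bundled copy of the Wisconsin diagnostic data in place of the Kaggle file, it omits the FIS and fuzzification stage entirely, and the train/test split, random seeds, scaling step, and the choice of 8 features for RFE are all assumptions that the abstract does not specify. It is not expected to reproduce the reported 99.1% accuracy.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Wisconsin diagnostic data stands in
# for the Kaggle WBCD file used in the paper.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: an 80/20 stratified split with a fixed seed; the abstract
# does not describe the evaluation protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# LR-PCA: standardize, project onto 8 principal components, then fit LR.
lr_pca = make_pipeline(
    StandardScaler(),
    PCA(n_components=8),
    LogisticRegression(max_iter=1000),
)
lr_pca.fit(X_train, y_train)
lr_pca_acc = lr_pca.score(X_test, y_test)

# RF-RFE: recursively eliminate the least important features, then fit RF.
# Assumption: keeping 8 features simply mirrors the 8 PCA components.
rf_rfe = make_pipeline(
    RFE(
        RandomForestClassifier(n_estimators=100, random_state=0),
        n_features_to_select=8,
    ),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
rf_rfe.fit(X_train, y_train)
rf_rfe_acc = rf_rfe.score(X_test, y_test)

print(f"LR-PCA (8 components) test accuracy: {lr_pca_acc:.3f}")
print(f"RF-RFE (8 features) test accuracy:   {rf_rfe_acc:.3f}")
```

Wrapping each variant in a pipeline keeps the scaler, PCA projection, and RFE selection fitted on the training split only, so the held-out accuracy is not inflated by information from the test data.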

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

