Affiliation:
1. Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Malaysia
2. Faculty of Computer and Information Technology, Sana’a University, Yemen
Abstract
Sentiment analysis is regarded as one of the most dynamic recent research fields in Natural Language Processing, driven by the rapidly growing volume of opinion data on the Web. Most approaches in this field focus on English because sentiment resources are lacking for other languages, such as Arabic and its many dialects. Good sentiment resources play a critical role in most sentiment analysis applications. This article therefore introduces several publicly available sentiment analysis resources for Arabic. It introduces the Arabic senti-lexicon, a list of 3880 positive and negative synsets annotated with their part of speech, polarity scores, dialect synsets and inflected forms. It also presents the Multi-domain Arabic Sentiment Corpus (MASC), which contains 8860 positive and negative reviews from different domains. An in-depth study of five types of feature sets is conducted to identify effective features and to investigate their effect on the performance of Arabic sentiment analysis. The aim is to assess the quality of the developed language resources and to integrate different feature sets and classification algorithms into a more accurate sentiment analysis method. The Arabic senti-lexicon is used to generate feature vectors. Five well-known machine learning algorithms (naïve Bayes, k-nearest neighbours, support vector machines (SVMs), logistic linear regression and neural networks) are employed as base classifiers for each of the feature sets. A wide range of comparative experiments on standard Arabic data sets was conducted; the results are discussed and conclusions are drawn. The experimental results show that the Arabic senti-lexicon is a very useful resource for Arabic sentiment analysis. Moreover, classifiers trained on feature vectors derived from the corpus using the Arabic senti-lexicon are more accurate than classifiers trained on the raw corpus.
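As a rough illustration of how a sentiment lexicon can be used to generate feature vectors, the following minimal sketch maps a tokenised review to three lexicon-based features. The lexicon entries, their scores, the choice of features and the function name `lexicon_features` are all assumptions made for illustration; they are not taken from the Arabic senti-lexicon or from the feature sets studied in the article.

```python
from typing import Dict, List

# Toy lexicon: token -> polarity score in [-1, 1].
# Entries and scores are illustrative assumptions, not data from the
# Arabic senti-lexicon.
TOY_LEXICON: Dict[str, float] = {
    "جيد": 0.6,     # "good"
    "ممتاز": 0.9,   # "excellent"
    "سيء": -0.7,    # "bad"
    "رديء": -0.8,   # "poor"
}


def lexicon_features(tokens: List[str], lexicon: Dict[str, float]) -> List[float]:
    """Return [positive count, negative count, summed polarity score]."""
    # Look up only the tokens that appear in the lexicon.
    scores = [lexicon[t] for t in tokens if t in lexicon]
    pos = sum(1 for s in scores if s > 0)
    neg = sum(1 for s in scores if s < 0)
    return [float(pos), float(neg), float(sum(scores))]


# Hypothetical review: "The hotel is excellent but the food is bad".
review = "الفندق ممتاز لكن الطعام سيء".split()
features = lexicon_features(review, TOY_LEXICON)
print(features)
```

Vectors of this kind could then be passed to any of the base classifiers named above in place of raw bag-of-words counts.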
Subject
Library and Information Sciences, Information Systems
Cited by
60 articles.
1. Challenges and Opportunities of Text-Based Emotion Detection: A Survey;IEEE Access;2024
2. Hands-Free Technology: Acceptance Model for Audiobooks;Contributions to Environmental Sciences & Innovative Business Technology;2024
3. Arabic Sentiment Analysis of Mobile Banking Services Reviews;2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS);2023-11-21
4. Emotional Expression and Information Communication in English Texts Based on Artificial Intelligence Technology;Applied Mathematics and Nonlinear Sciences;2023-11-08
5. Arabic Sentiment Analysis of Food Delivery Services Reviews;2023 International Symposium on Networks, Computers and Communications (ISNCC);2023-10-23