A DENSE SPATIAL NETWORK MODEL FOR EMOTION RECOGNITION USING LEARNING APPROACHES

Published: 2024-08-10
ISSN: 2375-4699
Container-title: ACM Transactions on Asian and Low-Resource Language Information Processing
Language: en
Short-container-title: ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

V Lakshmi Lalitha¹, Anguraj Dinesh Kumar¹

Affiliation:

1. Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, India

Abstract

Researchers are increasingly eager to develop techniques to extract emotional data from new sources due to the exponential growth of subjective information on Web 2.0. One of the most challenging aspects of textual emotion detection is the collection of data with emotion labels, given the subjectivity involved in labeling emotions. To address this significant issue, our research aims to aid in the development of effective solutions. We propose a Deep Convolutional Belief-based Spatial Network Model (DCB-SNM) as a semi-automated technique to tackle this challenge. This model involves two basic phases of analysis: text and video. In this process, pre-trained annotators identify the dominant emotion. Our work evaluates the impact of this automatic pre-annotation approach on manual emotion annotation from the perspectives of annotation time and agreement. The data on annotation time indicates an increase of roughly 20% when the pre-annotation procedure is utilized, without negatively affecting the annotators' skill. This demonstrates the benefits of pre-annotation approaches. Additionally, pre-annotation proves to be particularly advantageous for contributors with low prediction accuracy, enhancing overall annotation efficiency and reliability.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3688000

References (30 articles)

1. A survey on sentiment analysis and opinion mining for social multimedia

2. Chen, X. Li, Q. Jin, S. Zhang, and Y. Qin, "Video emotion recognition in the wild based on the fusion of multimodal features," in Proc. 18th ACM Int. Conf. Multimodal Interact., Oct. 2016, pp. 494–500.

3. Tsai, S. Bai, P. P. Liang, J. Z. Kolter, L.-P. Morency, and R. Salakhutdinov, "Multimodal transformer for unaligned multimodal language sequences," in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, 2019, p. 6558.

4. VATT: Transformers for multimodal self-supervised learning from raw video, audio, and text; Yuan L.; Proc. Adv. Neural Inf. Process. Syst., 2021

5. Dai, Z. Liu, T. Yu, and P. Fung, "Modality-transferable emotion embeddings for low-resource multimodal emotion recognition," 2020, arXiv:2009.09629.