Cross-corpus speech emotion recognition using subspace learning and domain adaption

Published:2022-12-27 Issue:1 Volume:2022 Page:
ISSN:1687-4722
Container-title:EURASIP Journal on Audio, Speech, and Music Processing
language:en
Short-container-title:J AUDIO SPEECH MUSIC PROC.

Author:

Cao Xuan,Jia Maoshen,Ru Jiawei,Pai Tun-wen

Abstract

Speech emotion recognition (SER) is a hot topic in speech signal processing. When the training data and the test data come from different corpora, their feature distributions differ, which degrades recognition performance. To address this problem, this paper proposes a cross-corpus speech emotion recognition method based on subspace learning and domain adaptation. Specifically, the training data and the test data form the source domain and the target domain, respectively. The Hessian matrix is then introduced to obtain the subspace for the features extracted in both domains. In addition, an information entropy-based domain adaptation method is introduced to construct a common space in which the difference between the source-domain and target-domain feature distributions is reduced as much as possible. To evaluate the proposed method, extensive experiments are conducted on cross-corpus speech emotion recognition. Experimental results show that the proposed method outperforms some existing subspace learning and domain adaptation methods.
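
The cross-corpus mismatch described in the abstract can be illustrated with a small sketch. This is not the paper's method: the Hessian-based subspace and the entropy-based adaptation are not reproduced here. It assumes synthetic Gaussian features in place of real acoustic features, uses the squared distance between domain means (a linear-kernel MMD) as a rough measure of distribution mismatch, and swaps in per-corpus standardization as a simple baseline alignment. The names `linear_mmd` and `standardize_per_domain` are hypothetical.

```python
import numpy as np

def linear_mmd(source, target):
    """Squared distance between the feature means of two domains (linear-kernel MMD)."""
    return float(np.sum((source.mean(axis=0) - target.mean(axis=0)) ** 2))

def standardize_per_domain(features):
    """Z-score each feature dimension using that domain's own statistics."""
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-8  # guard against zero variance
    return (features - mean) / std

rng = np.random.default_rng(0)
# Synthetic stand-ins for features from two corpora (assumption, not real data).
source = rng.normal(loc=0.0, scale=1.0, size=(200, 10))
target = rng.normal(loc=1.5, scale=2.0, size=(150, 10))

# Mismatch between the domains before and after the baseline alignment.
gap_before = linear_mmd(source, target)
gap_after = linear_mmd(standardize_per_domain(source), standardize_per_domain(target))
```

Per-corpus standardization only matches first- and second-order statistics per dimension. A learned common space, as the abstract describes, aims to reduce the distribution difference more thoroughly.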

Funder

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering,Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-022-00264-5.pdf

Reference: 65 articles.

1. S. Zhao, G. Jia, J. Yang, G. Ding, K. Keutzer, Emotion recognition from multiple modalities: fundamentals and methodologies. IEEE Sign. Process. Magazine 38(6), 59–73 (2021)

2. X. Wu, S. Hu, Z. Wu, X. Liu, H. Meng, in 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE ICASSP). Neural architecture search for speech emotion recognition (2022), pp. 1–4

3. C.-C. Lee, K. Sridhar, J.-L. Li, W.-C. Lin, B.-H. Su, C. Busso, Deep representation learning for affective speech signal analysis and processing: preventing unwanted signal disparities. IEEE Sign. Process. Magazine 38(6), 22–38 (2021)

4. J.S. Gómez-Cañón, E. Cano, T. Eerola, P. Herrera, H. Xiao, Y.-H. Yang, E. Gómez, Music emotion recognition: toward new, robust standards in personalized and context-sensitive applications. IEEE Sign. Process. Magazine 38(6), 106–114 (2021)

5. C.-H. Wu, W.-B. Liang, Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Trans. Affect. Comput. 2(1), 10–21 (2011)

Cited by 2 articles.

1. Design of smart home system speech emotion recognition model based on ensemble deep learning and feature fusion;Applied Acoustics;2024-03

2. Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques;EURASIP Journal on Audio, Speech, and Music Processing;2023-05-15