Environmental sound classification using temporal-frequency attention based convolutional neural network

Published: 2021-11-03 Issue: 1 Volume: 11
ISSN: 2045-2322
Container-title: Scientific Reports
Language: en
Short-container-title: Sci Rep

Author:

Mu Wenjie, Yin Bo, Huang Xianqing, Xu Jiali, Du Zehua

Abstract

Environmental sound classification is an important problem in audio recognition. Compared with structured sounds such as speech and music, environmental sounds have a more complicated time–frequency structure. To learn time and frequency features from the Log-Mel spectrogram more effectively, this paper proposes a temporal-frequency attention based convolutional neural network model (TFCNN). First, a motivating experiment is designed to verify the effect of a specific frequency band of the spectrogram on model classification. Second, two new attention mechanisms are proposed: a temporal attention mechanism and a frequency attention mechanism. These mechanisms focus on key frequency bands and semantically related time frames of the spectrogram, reducing the influence of background noise and irrelevant frequency bands. The two mechanisms are then combined so that their feature information is complementary, capturing the critical time–frequency features more accurately and improving the representation ability of the network model. Finally, experiments on two public data sets, UrbanSound8K and ESC-50, demonstrate the effectiveness of the proposed method.
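The idea of weighting frequency bands and time frames of a spectrogram can be illustrated with a minimal NumPy sketch. This is not the TFCNN architecture from the paper: the abstract does not specify how attention scores are computed or how the two branches are merged, so the mean-pooling scores, the additive combination, and the function names `frequency_attention`, `temporal_attention`, and `apply_attention` are all assumptions made for illustration. In a trained network the scores would come from learned layers rather than fixed pooling.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def frequency_attention(spec):
    """Return one weight per frequency band, shape (n_mels, 1).

    Assumption: the score of a band is its mean value over time.
    """
    scores = spec.mean(axis=1, keepdims=True)
    return softmax(scores, axis=0)


def temporal_attention(spec):
    """Return one weight per time frame, shape (1, n_frames).

    Assumption: the score of a frame is its mean value over frequency.
    """
    scores = spec.mean(axis=0, keepdims=True)
    return softmax(scores, axis=1)


def apply_attention(spec):
    """Combine both branches by summing the two re-weighted spectrograms.

    Assumption: additive fusion is only one possible way to make the
    two branches complementary.
    """
    freq_weighted = spec * frequency_attention(spec)  # broadcast over time
    time_weighted = spec * temporal_attention(spec)   # broadcast over frequency
    return freq_weighted + time_weighted


# Synthetic stand-in for a Log-Mel spectrogram: 40 bands, 100 frames.
rng = np.random.default_rng(0)
log_mel = rng.normal(size=(40, 100))
attended = apply_attention(log_mel)
```

The output has the same shape as the input, so a sketch like this could sit in front of any convolutional classifier that expects a spectrogram.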

Funder

National Natural Science Foundation of China

Key R&D Projects of Shandong Province

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-021-01045-4.pdf

References: 41 articles.

1. Baum, E., Harper, M., Alicea, R., et al. Sound identification for fire-fighting mobile robots. IEEE International Conference on Robotic Computing. IEEE Computer Society, pp. 79–86 (2018).

2. Wang, J. C. et al. Robust environmental sound recognition for home automation. IEEE Trans. Autom. Sci. Eng. 5(1), 25–31 (2008).

3. Radhakrishnan, R., Divakaran, A., Smaragdis, P. Audio analysis for surveillance applications. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, IEEE, pp. 158–161 (2005).

4. Salamon, J., Jacoby, C., Bello, J. P. A dataset and taxonomy for urban sound research. 22nd ACM International Conference on Multimedia. (ACM Press, 2014).

5. Bountourakis, V., Vrysis, L., Papanikolaou, G. Machine learning algorithms for environmental sound recognition: Towards soundscape semantics. ACM Int Conf Proc Ser, pp. 1–7 (2015).

Cited by 50 articles.

1. Efficient learning in spiking neural networks;Neurocomputing;2024-09

2. Using Deep Learning to Classify Environmental Sounds in the Habitat of Western Black-Crested Gibbons;Diversity;2024-08-22

3. Enhancing Emergency Vehicle Detection: A Deep Learning Approach with Multimodal Fusion;Mathematics;2024-05-13

4. A novel approach to build a low complexity smart sound recognition system for domestic environment;Applied Acoustics;2024-05

5. TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14