A Dynamic Position Embedding-Based Model for Student Classroom Complete Meta-Action Recognition
Author:
Shou Zhaoyu 1,2, Yuan Xiaohu 1, Li Dongxu 1, Mo Jianwen 1, Zhang Huibing 3, Zhang Jingwei 4, Wu Ziyong 4
Affiliation:
1. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
2. Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory, Guilin University of Electronic Technology, Guilin 541004, China
3. School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
4. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China
Abstract
Accurately recognizing students' complete classroom meta-actions is a key challenge in the personalized, adaptive interpretation of student behavior, owing to the complexity of these actions. This paper proposes a Dynamic Position Embedding-based model for Student classroom complete meta-Action Recognition (DPE-SAR), built on the Video Swin Transformer. The model uses a dynamic position embedding technique to perform conditional positional encoding and incorporates a deep convolutional network to improve parsing of the spatial structure of meta-actions. The full attention mechanism of ViT3D is used to extract latent spatial features of actions and to capture the global spatial-temporal information of meta-actions. In evaluations on public datasets and a smart-classroom meta-action recognition dataset, the proposed model outperforms baseline models on action recognition; the experimental results confirm its superiority for meta-action recognition.
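The abstract does not give implementation details, but the combination of "dynamic position embedding" with "conditional positional encoding" suggests a CPVT-style scheme, in which positional information is generated from the token content itself by a depthwise convolution over the token grid and added back as a residual. The sketch below is a minimal, illustrative NumPy version of that idea for a 3D (time, height, width) token grid; the function name, shapes, and kernel size are assumptions, not the paper's actual implementation.

```python
import numpy as np

def dynamic_position_embedding(tokens, kernels):
    """Illustrative conditional positional encoding (assumed, CPVT-style):
    a depthwise 3x3x3 convolution over the (T, H, W) token grid whose
    output is added back to the tokens as a residual, so positions are
    encoded conditionally on the token content.

    tokens:  (T, H, W, C) array of transformer tokens on their 3D grid
    kernels: (C, 3, 3, 3) array, one depthwise kernel per channel
    """
    T, H, W, C = tokens.shape
    # zero-pad the spatial-temporal grid so the output keeps its shape
    padded = np.pad(tokens, ((1, 1), (1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(tokens)
    for t in range(T):
        for h in range(H):
            for w in range(W):
                # 3x3x3 neighbourhood around this token, all channels
                patch = padded[t:t + 3, h:h + 3, w:w + 3, :]  # (3, 3, 3, C)
                # depthwise: each channel uses only its own kernel
                out[t, h, w] = np.einsum('ijkc,cijk->c', patch, kernels)
    return tokens + out  # residual connection
```

Because the embedding is computed from the tokens rather than looked up in a fixed table, it adapts to the input resolution, which is the usual motivation for conditional positional encodings in vision transformers. A production version would use a learned `Conv3d` layer rather than explicit loops.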
Funder
- National Natural Science Foundation of China
- Guangxi Natural Science Foundation
- Project of Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory
- Innovation Project of Guangxi Graduate Education
- Project for Improving the Basic Scientific Research Abilities of Young and Middle-aged Teachers in Guangxi Colleges and Universities