Abstract
With the continued development of neuromorphic sensors and spiking neural networks, event-driven perception learning is attracting increasing attention in both the visual and tactile domains. However, the limited representational capacity of existing spiking neurons and the high spatio-temporal complexity of event-driven visual and tactile data make this setting challenging. We therefore explore the potential of combined visual and tactile perception on event-driven datasets. We propose a spiking neural network method that fuses visual and tactile perception, aiming to improve the perceptual and information-integration capabilities of the fusion network. Our approach extracts features along both the time and position dimensions, and thereby captures the spatio-temporal dependencies in event data more effectively. In addition, we introduce a weighted spike loss function to optimize model performance and meet task-specific requirements. Experimental results show that the proposed visual-tactile fusion spiking neural network outperforms baseline algorithms on object recognition, container detection, and slip detection datasets.