Abstract
The human auditory system extracts rich linguistic abstractions from speech signals. Traditional approaches to understanding this complex process have used linear feature-encoding models, with limited success. Artificial neural networks excel in speech recognition tasks and offer promising computational models of speech processing. We used speech representations in state-of-the-art deep neural network (DNN) models to investigate neural coding from the auditory nerve to the speech cortex. Representations in hierarchical layers of the DNN correlated well with the neural activity throughout the ascending auditory system. Unsupervised speech models performed at least as well as other purely supervised or fine-tuned models. Deeper DNN layers were better correlated with the neural activity in the higher-order auditory cortex, with computations aligned with phonemic and syllabic structures in speech. Accordingly, DNN models trained on either English or Mandarin predicted cortical responses in native speakers of each language. These results reveal convergence between DNN model representations and the biological auditory pathway, offering new approaches for modeling neural coding in the auditory cortex.
Funder
U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke
William K. Bowes, Jr. Foundation
Shurl and Kay Curci Foundation
U.S. Department of Health & Human Services | NIH | National Institute on Deafness and Other Communication Disorders
Science and Technology Commission of Shanghai Municipality
Shanghai Municipal Health Bureau
Shanghai Shen Kang Hospital Development Center
Publisher
Springer Science and Business Media LLC