The Psychometrics of Automatic Speech Recognition

Published: 2021-04-20

Author:

Weerts Lotte, Rosen Stuart, Clopath Claudia, Goodman Dan F. M.

Abstract

Automatic speech recognition (ASR) software has been suggested as a candidate model of the human auditory system thanks to recent dramatic improvements in performance. To test this hypothesis, we compared several state-of-the-art ASR systems to results from humans on a barrage of standard psychometric experiments. While some systems showed qualitative agreement with humans in certain tests, in others all tested systems diverged markedly from humans. In particular, all systems used spectral invariance, temporal fine structure and speech periodicity differently from humans. We conclude that none of the tested ASR systems can yet act as a strong proxy for human speech recognition. However, we note that the more recent systems with better performance also tend to better match human results, suggesting that continued cross-fertilisation of ideas between human and automatic speech recognition may be fruitful. Our open source toolbox allows researchers to assess future ASR systems or add additional psychoacoustic measures.

Publisher

Cold Spring Harbor Laboratory

References (54 articles)

1. Arai T, Greenberg S. Speech intelligibility in the presence of cross-channel spectral asynchrony. In: 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 2. IEEE; 1998. p. 933–936.

2. Syllable intelligibility for temporally filtered LPC cepstral trajectories

3. Baevski A, Zhou H, Mohamed A, Auli M. wav2vec 2.0: A framework for self-supervised learning of speech representations. arXiv preprint arXiv:2006.11477. 2020.

4. Boersma P. Praat: doing phonetics by computer. http://www.praat.org/. 2021.

5. Identification of concurrent harmonic and inharmonic vowels: A test of the theory of harmonic cancellation and enhancement

Cited by 6 articles.

1. Models optimized for real-world tasks reveal the necessity of precise temporal coding in hearing;2024-04-25

2. Convenience vs. Reliability? Evaluation of Human-Robot Interaction Preferences in a Production Environment;Lecture Notes in Computer Science;2024

3. Intelligibility of speech in Parkinson's disease relies on anatomically segregated subthalamic beta oscillations;Neurobiology of Disease;2023-09

4. Employing Deep Learning Model to Evaluate Speech Information in Acoustic Simulations of Auditory Implants;2023-06-29

5. Employing Deep Learning Model to Evaluate Speech Information in Vocoder Simulations of Auditory Implants;2023-05-24