Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement

Author:

Green Tim (1), Hilkhuysen Gaston (1), Huckvale Mark (1), Rosen Stuart (1), Brookes Mike (2), Moore Alastair (2), Naylor Patrick (2), Lightburn Leo (2), Xue Wei (2)

Affiliation:

1. Department of Speech, Hearing and Phonetic Sciences, UCL, London, UK

2. Department of Electrical and Electronic Engineering, Imperial College, London, UK

Abstract

A signal processing approach combining beamforming with mask-informed speech enhancement was assessed by measuring sentence recognition in listeners with mild-to-moderate hearing impairment in adverse listening conditions that simulated the output of behind-the-ear hearing aids in a noisy classroom. Two types of beamforming were compared: binaural, with the two microphones of each aid treated as a single array, and bilateral, where independent left and right beamformers were derived. Binaural beamforming produces a narrower beam, maximising improvement in signal-to-noise ratio (SNR), but eliminates the spatial diversity that is preserved in bilateral beamforming. Each beamformer type was optimised for the true target position and implemented with and without additional speech enhancement in which spectral features extracted from the beamformer output were passed to a deep neural network trained to identify time-frequency regions dominated by target speech. Additional conditions comprising binaural beamforming combined with speech enhancement implemented using Wiener filtering or modulation-domain Kalman filtering were tested in normally-hearing (NH) listeners. Both beamformer types gave substantial improvements relative to no processing, with significantly greater benefit for binaural beamforming. Performance with additional mask-informed enhancement was poorer than with beamforming alone, for both beamformer types and both listener groups. In NH listeners the addition of mask-informed enhancement produced significantly poorer performance than both other forms of enhancement, neither of which differed from the beamformer alone. In summary, the additional improvement in SNR provided by binaural beamforming appeared to outweigh loss of spatial information, while speech understanding was not further improved by the mask-informed enhancement method implemented here.
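The SNR argument for binaural over bilateral beamforming can be illustrated with a toy simulation. The sketch below is not the beamformer used in the study: it assumes a simple delay-and-sum (averaging) beamformer, a target already time-aligned at all four microphones, and noise that is independent at each microphone, which a real noisy classroom would not satisfy. The sine-wave target, the sample rate, and all function and variable names are illustrative assumptions.

```python
import numpy as np


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, computed from separate signal and noise arrays."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))


def delay_and_sum(channels):
    """Average time-aligned microphone channels (array shape: mics x samples)."""
    return channels.mean(axis=0)


rng = np.random.default_rng(0)
fs = 16000                                  # assumed sample rate in Hz
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440.0 * t)      # stand-in for target speech

# Four microphones: rows 0-1 belong to the left aid, rows 2-3 to the right aid.
# Assumption: the target is identical (already aligned) at every microphone,
# and the noise is independent white noise at each microphone.
target_at_mics = np.tile(target, (4, 1))
noise_at_mics = rng.standard_normal((4, fs))

# Reference: a single unprocessed microphone.
input_snr = snr_db(target_at_mics[0], noise_at_mics[0])

# Bilateral: one independent two-microphone beamformer per ear (left ear shown).
left_snr = snr_db(delay_and_sum(target_at_mics[:2]),
                  delay_and_sum(noise_at_mics[:2]))

# Binaural: all four microphones treated as a single array, giving one output.
binaural_snr = snr_db(delay_and_sum(target_at_mics),
                      delay_and_sum(noise_at_mics))

bilateral_gain = left_snr - input_snr
binaural_gain = binaural_snr - input_snr

print(f"bilateral gain: {bilateral_gain:.1f} dB")
print(f"binaural gain:  {binaural_gain:.1f} dB")
```

Under these idealised assumptions, averaging M microphones improves SNR by about 10·log10(M) dB, so the two-microphone bilateral beamformer gains roughly 3 dB and the four-microphone binaural array roughly 6 dB. The binaural version produces a single output signal, however, which is why interaural spatial cues are lost, whereas the bilateral version keeps separate left and right outputs.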

Funder

Engineering and Physical Sciences Research Council

Publisher

SAGE Publications

Subject

Speech and Hearing, Otorhinolaryngology

Cited by 6 articles.

1. ecVoice: Audio Text Extraction Optimization of Video Based on Idioms Similarity Replacement;2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC);2023-10-31

2. Monaural speech separation using WT-Conv-TasNet for hearing aids;International Journal of Speech Technology;2023-09

3. Speech Enhancement: A Survey of Approaches and Applications;2023 2nd International Conference on Edge Computing and Applications (ICECAA);2023-07-19

4. Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning;Circuits, Systems, and Signal Processing;2023-03-01

5. Acoustic Source Tracking Based on Probabilistic Data Association and Distributed Cubature Kalman Filtering in Acoustic Sensor Networks;Sensors;2022-09-21
