Listening with generative models

Published: 2023-04-28

Author:

Cusimano Maddie, Hewitt Luke B., McDermott Josh H.

Abstract

Perception has long been envisioned to use an internal model of the world to explain the causes of sensory signals. However, such accounts have historically not been testable, typically requiring intractable search through the space of possible explanations. Using auditory scenes as a case study, we leveraged contemporary computational tools to infer explanations of sounds in a candidate internal model of the auditory world (ecologically inspired audio synthesizers). Model inferences accounted for many classic illusions. Unlike traditional accounts of auditory illusions, the model is applicable to any sound, and exhibited human-like perceptual organization for real-world sound mixtures. The combination of stimulus-computability and interpretable model structure enabled ‘rich falsification’, revealing additional assumptions about sound generation needed to account for perception. The results show how generative models can account for the perception of both classic illusions and everyday sensory signals, and provide the basis on which to build theories of perception.

Publisher

Cold Spring Harbor Laboratory

References (169 articles)

1. A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure–ground organization.

2. A century of Gestalt psychology in visual perception: II. Conceptual and theoretical foundations.

3. The auditory organization of speech and other sources in listeners and computational models; Speech Communication, 2001

4. Decoding speech in the presence of other sources; Speech Communication, 2005

5. A. S. Bregman, Auditory scene analysis: The perceptual organization of sound (MIT Press, 1994).