On Human-like Biases in Convolutional Neural Networks for the Perception of Slant from Texture

Published: 2023-10-25 Issue: 4 Volume: 20 Pages: 1-18
ISSN: 1544-3558
Container-title: ACM Transactions on Applied Perception
Language: en
Short-container-title: ACM Trans. Appl. Percept.

Author:

Wang Yuanhao¹, Zhang Qian¹, Aubuchon Celine², Kemp Jovan², Domini Fulvio², Tompkin James¹

Affiliation:

1. Brown University Department of Computer Science, USA

2. Brown University Department of Cognitive, Linguistic, and Psychological Sciences, USA

Abstract

Depth estimation is fundamental to 3D perception, and humans are known to have biased estimates of depth. This study investigates whether convolutional neural networks (CNNs) can be biased when predicting the sign of curvature and the depth of textured surfaces under different viewing conditions (field of view) and surface parameters (slant and texture irregularity). This hypothesis is drawn from the idea that texture gradients described by local neighborhoods, a cue identified in the human vision literature, are also representable within convolutional neural networks. To this end, we trained both unsupervised and supervised CNN models on renderings of slanted surfaces with random polka-dot patterns and analyzed their internal latent representations. The results show that the unsupervised models exhibit prediction biases similar to those of humans across all experiments, while the supervised CNN models do not. The latent spaces of the unsupervised models can be linearly separated into axes representing field of view and optical slant. For supervised models, this ability varies substantially with model architecture and the kind of supervision (continuous slant vs. sign of slant). Although this study says nothing about any shared mechanism, these findings suggest that unsupervised CNN models can make predictions similar to those of the human visual system. Code: github.com/brownvc/Slant-CNN-Biases.
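
As an illustration of what "linearly separated into axes" can mean in practice, the following is a minimal sketch of a linear probe on latent codes. It is not the authors' code (see the repository linked above for that). The latent vectors here are synthetic, and the function names, the 16-dimensional latent size, the value ranges for field of view and slant, and the noise level are all assumptions made for illustration only.

```python
import numpy as np

def fit_linear_probe(latents, targets):
    """Fit a least-squares linear map (with bias) from latent codes to targets."""
    design = np.hstack([latents, np.ones((latents.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def probe_r2(latents, targets, weights):
    """Coefficient of determination of the probe, one value per target column."""
    design = np.hstack([latents, np.ones((latents.shape[0], 1))])
    residual = targets - design @ weights
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((targets - targets.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_samples, latent_dim = 2000, 16  # hypothetical sizes

# Hypothetical generative factors: field of view and optical slant (degrees).
factors = np.column_stack([
    rng.uniform(5.0, 60.0, n_samples),    # assumed field-of-view range
    rng.uniform(-70.0, 70.0, n_samples),  # assumed slant range
])

# Synthetic stand-in for an encoder: a random linear mixing of the
# standardized factors plus noise. A real study would use latents
# produced by a trained model instead.
standardized = (factors - factors.mean(axis=0)) / factors.std(axis=0)
mixing = rng.normal(size=(2, latent_dim))
latents = standardized @ mixing + 0.1 * rng.normal(size=(n_samples, latent_dim))

# Fit on a training split and evaluate on held-out samples.
train, test = slice(0, 1500), slice(1500, None)
weights = fit_linear_probe(latents[train], factors[train])
r2_fov, r2_slant = probe_r2(latents[test], factors[test], weights)
print(f"held-out R^2: field of view = {r2_fov:.3f}, slant = {r2_slant:.3f}")
```

A high held-out R^2 for both targets would indicate that each factor is recoverable along a linear direction in the latent space. Because the synthetic latents above are linear by construction, the sketch only demonstrates the probing procedure and says nothing about any trained model.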

Funder

NSF

NIH

Publisher

Association for Computing Machinery (ACM)

Subject

Experimental and Cognitive Psychology, General Computer Science, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3613451
