Author:
Duranthon O, Marsili M, Xie R
Abstract
We explore the hypothesis that learning machines extract representations of maximal relevance, where the relevance is defined as the entropy of the energy distribution of the internal representation. We show that the mutual information between the internal representation of a learning machine and the features that it extracts from the data is bounded from below by the relevance. This motivates our study of models with maximal relevance—which we call optimal learning machines—as candidates for maximally informative representations. We analyse how the maximisation of the relevance is constrained in practical cases both by the architecture of the model and by the available data. We find that sub-extensive features that do not affect the thermodynamics of the model may nevertheless affect learning performance significantly, and that criticality enhances learning performance, although the existence of a critical point is not a necessary condition. On specific learning tasks, we find that (i) the maximal values of the likelihood are achieved by models with maximal relevance, (ii) internal representations approach the maximal relevance that can be achieved with a finite dataset and (iii) learning is associated with a broadening of the spectrum of energy levels of the internal representation, in agreement with the maximum relevance hypothesis.
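In finite samples, the relevance of a representation can be estimated from how often each internal state occurs: states seen the same number of times k share an empirical energy level, and the relevance is the entropy of the resulting frequency-class distribution. The following sketch illustrates this estimator under that assumption (the function name and the frequency-class formulation are illustrative, not taken verbatim from the paper):

```python
from collections import Counter
import math

def relevance(samples):
    """Empirical relevance of a sampled representation.

    The energy of a state s is estimated as -log of its frequency, so states
    observed k times share one energy level. With m_k the number of distinct
    states seen exactly k times out of N samples, the relevance is the entropy
    of p(k) = k * m_k / N over the occupied energy levels.
    """
    N = len(samples)
    counts = Counter(samples)        # state -> k, its number of occurrences
    m = Counter(counts.values())     # k -> m_k, number of states seen k times
    H = 0.0
    for k, mk in m.items():
        p = k * mk / N               # probability mass on the level with frequency k
        H -= p * math.log(p)
    return H
```

For example, the sample ['a', 'a', 'b', 'c'] has one state seen twice and two seen once, giving equal mass 1/2 on two energy levels and hence relevance log 2.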
Subject
Statistics, Probability and Uncertainty; Statistics and Probability; Statistical and Nonlinear Physics
Cited by
7 articles.
1. A simple probabilistic neural network for machine understanding;Journal of Statistical Mechanics: Theory and Experiment;2024-02-20
2. Simplicity science;Indian Journal of Physics;2024-02-01
3. Learning Interacting Theories from Data;Physical Review X;2023-11-20
4. Multiscale relevance of natural images;Scientific Reports;2023-09-09
5. A random energy approach to deep learning;Journal of Statistical Mechanics: Theory and Experiment;2022-07-01