Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry*
Publication:5055419
DOI: 10.1088/1742-5468/AC9832 | OpenAlex: W4309880056 | MaRDI QID: Q5055419
Antonio Ferraro, Riccardo Zecchina, Gabriele Perugini, Carlo Baldassi, Christoph Feinauer, Fabrizio Pittorino
Publication date: 13 December 2022
Published in: Journal of Statistical Mechanics: Theory and Experiment
Full work available at URL: https://arxiv.org/abs/2202.03038
Cites Work
- Flat Minima
- Entropic gradient descent algorithms and wide flat minima*
- The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Entropy-SGD: biasing gradient descent into wide valleys
- Shaping the learning landscape in neural networks around wide flat minima