Pages that link to "Item:Q828491"
From MaRDI portal
The following pages link to On large batch training and sharp minima: a Fokker-Planck perspective (Q828491):
- On the diffusion approximation of nonconvex stochastic gradient descent (Q1734292)
- An empirical study into finding optima in stochastic optimization of neural networks (Q2127118)
- The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima (Q5073270)