Stochastic gradient descent with noise of machine learning type. II: Continuous time analysis
From MaRDI portal
Publication: 6188971
DOI: 10.1007/s00332-023-09992-0 · arXiv: 2106.02588 · OpenAlex: W3166627127 · MaRDI QID: Q6188971
Publication date: 12 January 2024
Published in: Journal of Nonlinear Science (Search for Journal in Brave)
Full work available at URL: https://arxiv.org/abs/2106.02588
Keywords: stochastic differential equation; nonconvex optimization; overparametrization; machine learning; degenerate diffusion equation; stochastic gradient descent; deep learning; invariant distribution; implicit bias; Poincaré-Hardy inequality; global minimum selection; flat minimum selection
MSC classification: Artificial neural networks and deep learning (68T07); Nonconvex programming, global optimization (90C26); Degenerate parabolic equations (35K65); Applications of stochastic analysis (to PDEs, etc.) (60H30)
Cites Work
- Unnamed Item
- Unnamed Item
- Regularity theory for general stable operators
- Improved Poincaré inequalities
- Optimal control of stochastic differential equations via Fokker-Planck equations
- Functional analysis, Sobolev spaces and partial differential equations
- Elliptic partial differential equations of second order
- Analysis of a two-layer neural network via displacement convexity
- Analysis of stochastic gradient descent in continuous time
- Mean-field Langevin dynamics and energy landscape of neural networks
- Regularity theory for general stable operators: parabolic equations
- Rectifiable sets, densities and tangent measures
- Sub-Laplacian eigenvalue bounds on sub-Riemannian manifolds
- The Variational Formulation of the Fokker--Planck Equation
- Stochastic Gradient Descent in Continuous Time
- On the Heat Diffusion for Generic Riemannian and Sub-Riemannian Structures
- Sharp rates of decay of solutions to the nonlinear fast diffusion equation via functional inequalities
- A Comprehensive Introduction to Sub-Riemannian Geometry
- A mean field view of the landscape of two-layer neural networks
- Stochastic Gradient Descent in Continuous Time: A Central Limit Theorem
- Mean Field Analysis of Neural Networks: A Law of Large Numbers
- About the Hardy Inequality
- Bounds for the Discrete Part of the Spectrum of a Semi-Bounded Schrödinger Operator
- A Liouville Theorem for Degenerate Elliptic Equations
- A Stochastic Approximation Method
- Wahrscheinlichkeitstheorie
- Stochastic gradient descent with noise of machine learning type. I: Discrete time analysis