Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
From MaRDI portal
Publication: 6107984
DOI: 10.1016/j.amc.2023.127907
arXiv: 2003.01291
OpenAlex: W3009420624
MaRDI QID: Q6107984
Publication date: 29 June 2023
Published in: Applied Mathematics and Computation
Full work available at URL: https://arxiv.org/abs/2003.01291
Keywords: strong convergence; approximation; optimisation; stochastic gradient descent; deep learning; generalisation; deep neural networks; empirical risk minimisation; full error analysis; random initialisation
MSC classifications: Artificial intelligence (68Txx); Numerical methods for partial differential equations, initial value and time-dependent initial-boundary value problems (65Mxx); Approximations and expansions (41Axx)
Cites Work
- Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations
- Tractability of multivariate problems. Volume I: Linear information
- Bounds for the ratio of two gamma functions
- Tractability of multivariate problems. Volume II: Standard information for functionals.
- Approximation and estimation bounds for artificial neural networks
- Multilayer feedforward networks are universal approximators
- General multilevel adaptations for stochastic approximation algorithms of Robbins-Monro and Polyak-Ruppert type
- Provable approximation properties for deep neural networks
- A distribution-free theory of nonparametric regression
- Degree of approximation by neural and translation networks with a single hidden layer
- Approximation of functions and their derivatives: A neural network implementation with applications
- Exponential convergence of the deep neural network approximation for analytic functions
- DGM: a deep learning algorithm for solving partial differential equations
- Proof that deep artificial neural networks overcome the curse of dimensionality in the numerical approximation of Kolmogorov partial differential equations with constant diffusion and nonlinear drift coefficients
- DNN expression rate analysis of high-dimensional PDEs: application to option pricing
- Deep neural network approximations for solutions of PDEs based on Monte Carlo algorithms
- On the approximation by single hidden layer feedforward neural networks with fixed weights
- Optimal approximation of piecewise smooth functions using deep ReLU neural networks
- Gradient descent optimizes over-parameterized deep ReLU networks
- Nonlinear approximation via compositions
- A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
- A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations
- A priori estimates of the population risk for two-layer neural networks
- Error bounds for approximations with deep ReLU networks
- Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates
- Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations
- Local Rademacher complexities
- Space-time error estimates for deep neural network approximations for differential equations
- On the mathematical foundations of learning
- Deep vs. shallow networks: An approximation theory perspective
- Some Elementary Inequalities Relating to the Gamma and Incomplete Gamma Function
- Universal approximation bounds for superpositions of a sigmoidal function
- Neural Networks for Localized Approximation
- Deep learning in high dimension: Neural network expression rates for generalized polynomial chaos expansions in UQ
- Strong error analysis for stochastic gradient descent optimization algorithms
- Convergence in Hölder norms with applications to Monte Carlo methods in infinite dimensions
- Optimal Approximation with Sparsely Connected Deep Neural Networks
- New Error Bounds for Deep ReLU Networks Using Sparse Grids
- Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black--Scholes Partial Differential Equations
- Full error analysis for the training of deep neural networks
- Uniform error estimates for artificial neural network approximations for heat equations
- Breaking the Curse of Dimensionality with Convex Neural Networks
- Understanding Machine Learning
- A Stochastic Approximation Method
- Approximation by superpositions of a sigmoidal function
- Error bounds for approximation with neural networks