Halting time is predictable for large models: a universality property and average-case analysis
From MaRDI portal
Publication:2697399
DOI10.1007/s10208-022-09554-yOpenAlexW3033781806MaRDI QIDQ2697399
Elliot Paquette, Fabian Pedregosa, Bart van Merriënboer, Courtney Paquette
Publication date: 12 April 2023
Published in: Foundations of Computational Mathematics (Search for Journal in Brave)
Full work available at URL: https://arxiv.org/abs/2006.04299
Artificial neural networks and deep learning (68T07) Random matrices (probabilistic aspects) (60B20) Convex programming (90C25) Large-scale problems in mathematical programming (90C06) Numerical optimization and variational techniques (65K10)
Related Items
Uses Software
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- Smooth strongly convex interpolation and exact worst-case performance of first-order methods
- Anisotropic local laws for random matrices
- Large complex correlated Wishart matrices: fluctuations and asymptotic independence at the edges
- Chebyshev semi-iterative methods, successive overrelaxation iterative methods, and second order Richardson iterative methods. I, II
- Random matrices: The distribution of the smallest singular values
- Spectral analysis of large dimensional random matrices
- No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices
- Introductory lectures on convex optimization. A basic course.
- Exact separation of eigenvalues of large dimensional sample covariance matrices
- A random matrix approach to neural networks
- Universality in numerical computation with random data: case studies, analytical results and some speculations
- CLT for linear spectral statistics of large-dimensional sample covariance matrices.
- The Riemann--Hilbert approach to strong asymptotics for orthogonal polynomials on [-1,1]
- Smoothed analysis for the conjugate gradient algorithm
- A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
- Universality in numerical computations with random data
- On the average number of steps of the simplex method of linear programming
- Polynomial Based Iteration Methods for Symmetric Linear Systems
- How long does it take to compute the eigenvalues of a random symmetric matrix?
- Random matrix theory
- Smoothed Analysis of the Condition Numbers and Growth Factors of Matrices
- Average-Case Stability of Gaussian Elimination
- Smoothed analysis of algorithms
- The Probability That a Numerical Analysis Problem is Difficult
- Eigenvalues and Condition Numbers of Random Matrices
- Probabilistic Models for Linear Programming
- Probability
- Universal halting times in optimization and machine learning
- Optimization Methods for Large-Scale Machine Learning
- The conjugate gradient algorithm on well-conditioned Wishart matrices is almost deterministic
- Universality in numerical computation with random data: Case studies and analytical results
- Some methods of speeding up the convergence of iteration methods
- Numerical Determination of Fundamental Modes
- Methods of conjugate gradients for solving linear systems
- Quicksort