Breaking the Curse of Dimensionality with Convex Neural Networks
Publication: 5361282
zbMath: 1433.68390
arXiv: 1412.8690
MaRDI QID: Q5361282
Publication date: 27 September 2017
Full work available at URL: https://arxiv.org/abs/1412.8690
MSC classes: Nonparametric estimation (62G05); Artificial neural networks and deep learning (68T07); Convex programming (90C25)
Related Items
Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
A Cross-Validated Ensemble Approach to Robust Hypothesis Testing of Continuous Nonlinear Interactions: Application to Nutrition-Environment Studies
A Regularity Theory for Static Schrödinger Equations on \(\mathbb{R}^d\) in Spectral Barron Spaces
Deep learning: a statistical viewpoint
Neural network approximation
Spurious Valleys in Two-layer Neural Network Optimization Landscapes
A Proof that Artificial Neural Networks Overcome the Curse of Dimensionality in the Numerical Approximation of Black–Scholes Partial Differential Equations
Machine learning from a continuous viewpoint. I
Challenges in optimization with complex PDE-systems. Abstracts from the workshop held February 14–20, 2021 (hybrid meeting)
Thermodynamically consistent physics-informed neural networks for hyperbolic systems
Degrees of freedom for off-the-grid sparse estimation
Two-Layer Neural Networks with Values in a Banach Space
Particle dual averaging: optimization of mean field neural network with global convergence rate analysis
Locality defeats the curse of dimensionality in convolutional teacher–student scenarios
Relative stability toward diffeomorphisms indicates performance in deep nets
Finite-Sample Two-Group Composite Hypothesis Testing via Machine Learning
A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
Sparse optimization on measures with over-parameterized gradient descent
Deep learning for constrained utility maximisation
Nonlinear Variable Selection via Deep Neural Networks
Approximation properties of deep ReLU CNNs
Uniform approximation rates and metric entropy of shallow neural networks
What Kinds of Functions Do Deep Neural Networks Learn? Insights from Variational Spline Theory
Benign overfitting in linear regression
Deep ReLU Networks Overcome the Curse of Dimensionality for Generalized Bandlimited Functions
Nonconvex regularization for sparse neural networks
Full error analysis for the training of deep neural networks
Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method
Learning the mapping \(\mathbf{x}\mapsto \sum\limits_{i=1}^d x_i^2\): the cost of finding the needle in a haystack
Bayesian Imaging Using Plug & Play Priors: When Langevin Meets Tweedie
Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
Conditional regression for single-index models
Piecewise linear functions representable with infinite width shallow ReLU neural networks
Rates of approximation by ReLU shallow neural networks
Probabilistic partition of unity networks for high-dimensional regression problems
DEEP EQUILIBRIUM NETS
Deep Learning for Marginal Bayesian Posterior Inference with Recurrent Neural Networks
A priori generalization error analysis of two-layer neural networks for solving high dimensional Schrödinger eigenvalue problems
Quantum Monte Carlo for economics: stress testing and macroeconomic deep learning
Function approximation by deep networks
Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Convergence rates of gradient methods for convex optimization in the space of measures
A mathematical perspective of machine learning
The geometry of off-the-grid compressed sensing
Self-Supervised Deep Learning for Image Reconstruction: A Langevin Monte Carlo Approach
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
On maximum a posteriori estimation with Plug & Play priors and stochastic gradient descent
Efficient Global Optimization of Two-Layer ReLU Networks: Quadratic-Time Algorithms and Adversarial Training
Gradient descent on infinitely wide neural networks: global convergence and generalization
Greedy training algorithms for neural networks and applications to PDEs
Nonparametric regression using deep neural networks with ReLU activation function
ExSpliNet: An interpretable and expressive spline-based neural network
Kolmogorov width decay and poor approximators in machine learning: shallow neural networks, random feature models and neural tangent kernels
Landscape and training regimes in deep learning
Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Convergence Analysis
Machine Learning and Computational Mathematics
Analysis of a two-layer neural network via displacement convexity
Ensemble feature selection using election methods and ranker clustering
Topological properties of the set of functions generated by neural networks of fixed size
Linearized two-layers neural networks in high dimension
High-dimensional index volatility models via Stein's identity
Error bounds for deep ReLU networks using the Kolmogorov-Arnold superposition theorem
Fast generalization error bound of deep learning without scale invariance of activation functions
Optimally weighted loss functions for solving PDEs with neural networks
Fitting small piece-wise linear neural network models to interpolate data sets
Supervised learning from noisy observations: combining machine-learning techniques with data assimilation
Multikernel Regression with Sparsity Constraint
The Gap between Theory and Practice in Function Approximation with Deep Neural Networks
Asymptotic learning curves of kernel methods: empirical data versus teacher–student paradigm
CAS4DL: Christoffel adaptive sampling for function approximation via deep learning
Understanding neural networks with reproducing kernel Banach spaces
The interpolation phase transition in neural networks: memorization and generalization under lazy training
On the Effectiveness of Richardson Extrapolation in Data Science
When do neural networks outperform kernel methods?
Approximation Error Analysis of Some Deep Backward Schemes for Nonlinear PDEs
Representation formulas and pointwise properties for Barron functions
Deep ReLU neural networks overcome the curse of dimensionality for partial integrodifferential equations
Approximating functions with multi-features by deep convolutional neural networks
The Barron space and the flow-induced function spaces for neural network models
Robust and resource-efficient identification of two hidden layer neural networks
High-order approximation rates for shallow neural networks with cosine and \(\mathrm{ReLU}^k\) activation functions