Breaking the Curse of Dimensionality with Convex Neural Networks
Publication: 5361282
zbMath: 1433.68390
arXiv: 1412.8690
MaRDI QID: Q5361282
Publication date: 27 September 2017
Full work available at URL: https://arxiv.org/abs/1412.8690
MSC classes: Nonparametric estimation (62G05); Artificial neural networks and deep learning (68T07); Convex programming (90C25)
Related Items
Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
A Cross-Validated Ensemble Approach to Robust Hypothesis Testing of Continuous Nonlinear Interactions: Application to Nutrition-Environment Studies
A Regularity Theory for Static Schrödinger Equations on \(\mathbb{R}^d\) in Spectral Barron Spaces
Deep learning: a statistical viewpoint
Neural network approximation
Spurious Valleys in Two-layer Neural Network Optimization Landscapes
A Proof that Artificial Neural Networks Overcome the Curse of Dimensionality in the Numerical Approximation of Black–Scholes Partial Differential Equations
Machine learning from a continuous viewpoint. I
Challenges in optimization with complex PDE-systems. Abstracts from the workshop held February 14–20, 2021 (hybrid meeting)
Thermodynamically consistent physics-informed neural networks for hyperbolic systems
Degrees of freedom for off-the-grid sparse estimation
Two-Layer Neural Networks with Values in a Banach Space
Particle dual averaging: optimization of mean field neural network with global convergence rate analysis
Locality defeats the curse of dimensionality in convolutional teacher–student scenarios
Relative stability toward diffeomorphisms indicates performance in deep nets
Finite-Sample Two-Group Composite Hypothesis Testing via Machine Learning
A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
Sparse optimization on measures with over-parameterized gradient descent
Deep learning for constrained utility maximisation
Nonlinear Variable Selection via Deep Neural Networks
Approximation properties of deep ReLU CNNs
Uniform approximation rates and metric entropy of shallow neural networks
What Kinds of Functions Do Deep Neural Networks Learn? Insights from Variational Spline Theory
Benign overfitting in linear regression
Deep ReLU Networks Overcome the Curse of Dimensionality for Generalized Bandlimited Functions
Nonconvex regularization for sparse neural networks
Full error analysis for the training of deep neural networks
Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method
Learning the mapping \(\mathbf{x}\mapsto \sum\limits_{i=1}^d x_i^2\): the cost of finding the needle in a haystack
Bayesian Imaging Using Plug & Play Priors: When Langevin Meets Tweedie
Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
Conditional regression for single-index models
Piecewise linear functions representable with infinite width shallow ReLU neural networks
Rates of approximation by ReLU shallow neural networks
Probabilistic partition of unity networks for high-dimensional regression problems
DEEP EQUILIBRIUM NETS
Deep Learning for Marginal Bayesian Posterior Inference with Recurrent Neural Networks
A priori generalization error analysis of two-layer neural networks for solving high dimensional Schrödinger eigenvalue problems
Quantum Monte Carlo for economics: stress testing and macroeconomic deep learning
Function approximation by deep networks
Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Convergence rates of gradient methods for convex optimization in the space of measures
A mathematical perspective of machine learning
The geometry of off-the-grid compressed sensing
Self-Supervised Deep Learning for Image Reconstruction: A Langevin Monte Carlo Approach
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
On maximum a posteriori estimation with Plug & Play priors and stochastic gradient descent
Efficient Global Optimization of Two-Layer ReLU Networks: Quadratic-Time Algorithms and Adversarial Training
Gradient descent on infinitely wide neural networks: global convergence and generalization
Greedy training algorithms for neural networks and applications to PDEs
Nonparametric regression using deep neural networks with ReLU activation function
ExSpliNet: An interpretable and expressive spline-based neural network
Kolmogorov width decay and poor approximators in machine learning: shallow neural networks, random feature models and neural tangent kernels
Landscape and training regimes in deep learning
Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Convergence Analysis
Machine Learning and Computational Mathematics
Analysis of a two-layer neural network via displacement convexity
Ensemble feature selection using election methods and ranker clustering
Topological properties of the set of functions generated by neural networks of fixed size
Linearized two-layers neural networks in high dimension
High-dimensional index volatility models via Stein's identity
Error bounds for deep ReLU networks using the Kolmogorov-Arnold superposition theorem
Fast generalization error bound of deep learning without scale invariance of activation functions
Optimally weighted loss functions for solving PDEs with neural networks
Fitting small piece-wise linear neural network models to interpolate data sets
Supervised learning from noisy observations: combining machine-learning techniques with data assimilation
Multikernel Regression with Sparsity Constraint
The Gap between Theory and Practice in Function Approximation with Deep Neural Networks
Asymptotic learning curves of kernel methods: empirical data versus teacher–student paradigm
CAS4DL: Christoffel adaptive sampling for function approximation via deep learning
Understanding neural networks with reproducing kernel Banach spaces
The interpolation phase transition in neural networks: memorization and generalization under lazy training
On the Effectiveness of Richardson Extrapolation in Data Science
When do neural networks outperform kernel methods?
Approximation Error Analysis of Some Deep Backward Schemes for Nonlinear PDEs
Representation formulas and pointwise properties for Barron functions
Deep ReLU neural networks overcome the curse of dimensionality for partial integrodifferential equations
Approximating functions with multi-features by deep convolutional neural networks
The Barron space and the flow-induced function spaces for neural network models
Robust and resource-efficient identification of two hidden layer neural networks
High-order approximation rates for shallow neural networks with cosine and \(\mathrm{ReLU}^k\) activation functions