Local Rademacher complexities and oracle inequalities in risk minimization (2004 IMS Medallion Lecture; with discussions and rejoinder)
DOI: 10.1214/009053606000001019
zbMATH: 1118.62065
arXiv: 0708.0083
OpenAlex: W3105849782
Wikidata: Q105584237 (Scholia: Q105584237)
MaRDI QID: Q2373576
Publication date: 12 July 2007
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/0708.0083
Mathematics Subject Classification:
- Nonparametric regression and quantile regression (62G08)
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Computational learning theory (68Q32)
- Learning and adaptive systems in artificial intelligence (68T05)
- Pattern recognition, speech recognition (68T10)
- Probability theory on algebraic and topological structures (60B99)
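For orientation, the two objects named in the title can be sketched in their standard form; the notation below is the conventional one from this literature, not quoted from the entry itself. Given a function class \(\mathcal F\), an i.i.d. sample \(X_1,\dots,X_n\) drawn from \(P\), and independent Rademacher signs \(\varepsilon_1,\dots,\varepsilon_n\), the local Rademacher complexity restricts the usual supremum to functions with small second moment:
\[
R_n(\mathcal F; r) \;=\; \mathbb{E}\,\sup_{f \in \mathcal F,\; P f^2 \le r}\; \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i f(X_i).
\]
An oracle inequality then bounds the excess risk of an empirical risk minimizer \(\hat f\) by the best achievable risk in the class plus a remainder governed by the fixed point \(r_n^*\) of \(r \mapsto R_n(\mathcal F; r)\) (the solution, up to constants, of \(R_n(\mathcal F; r) \asymp r\)), typically
\[
P \ell_{\hat f} \;\le\; \inf_{f \in \mathcal F} P \ell_f \;+\; C\, r_n^*
\]
with high probability, for a constant \(C\).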
Related Items
- Sampling and empirical risk minimization
- Deep learning: a statistical viewpoint
- Sample average approximation with heavier tails. I: Non-asymptotic bounds with weak assumptions and stochastic constraints
- Sample average approximation with heavier tails. II: Localization in stochastic convex optimization and persistence results for the Lasso
- Statistical inference using regularized M-estimation in the reproducing kernel Hilbert space for handling missing data
- Wild bootstrap inference for penalized quantile regression for longitudinal data
- Convergence rates for shallow neural networks learned by gradient descent
- Robust supervised learning with coordinate gradient descent
- Honest confidence sets in nonparametric IV regression and other ill-posed models
- Concentration Inequalities for Samples without Replacement
- U-Processes and Preference Learning
- Variance-based regularization with convex objectives
- Optimal survey schemes for stochastic gradient descent with applications to M-estimation
- Complexity versus Agreement for Many Views
- On least squares estimation under heteroscedastic and heavy-tailed errors
- Tikhonov, Ivanov and Morozov regularization for support vector machine learning
- Sparsity in penalized empirical risk minimization
- Empirical variance minimization with applications in variance reduction and optimal control
- Noisy discriminant analysis with boundary assumptions
- Regularization in kernel learning
- Statistical properties of kernel principal component analysis
- Model selection by bootstrap penalization for classification
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Fast learning rates in statistical inference through aggregation
- Optimal robust mean and location estimation via convex programs with respect to any pseudo-norms
- Measuring distributional asymmetry with Wasserstein distance and Rademacher symmetrization
- Fast learning rate of non-sparse multiple kernel learning and optimal regularization strategies
- Fast rates for empirical vector quantization
- Joint regression analysis of mixed-type outcome data via efficient scores
- Complex sampling designs: uniform limit theorems and applications
- Localization of VC classes: beyond local Rademacher complexities
- Inverse statistical learning
- The two-sample problem for Poisson processes: adaptive tests with a nonasymptotic wild bootstrap approach
- Local Rademacher complexity: sharper risk bounds with and without unlabeled samples
- Robust statistical learning with Lipschitz and convex loss functions
- Compressive statistical learning with random feature moments
- Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics
- A statistical view of clustering performance through the theory of \(U\)-processes
- On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces
- Empirical risk minimization for heavy-tailed losses
- Discussion of "On concentration for (regularized) empirical risk minimization" by Sara van de Geer and Martin Wainwright
- Random design analysis of ridge regression
- Adaptive estimation of a distribution function and its density in sup-norm loss by wavelet and spline projections
- Statistical performance of support vector machines
- Ranking and empirical minimization of \(U\)-statistics
- Aggregation of estimators and stochastic optimization
- On the optimality of the empirical risk minimization procedure for the convex aggregation problem
- Global uniform risk bounds for wavelet deconvolution estimators
- Rates of convergence in active learning
- From Gauss to Kolmogorov: localized measures of complexity for ellipses
- Gibbs posterior concentration rates under sub-exponential type losses
- Nonasymptotic analysis of robust regression with modified Huber's loss
- Sharper lower bounds on the performance of the empirical risk minimization algorithm
- Empirical risk minimization is optimal for the convex aggregation problem
- Risk bounds for CART classifiers under a margin condition
- Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression
- Oracle inequalities for cross-validation type procedures
- Optimal model selection in heteroscedastic regression using piecewise polynomial functions
- Concentration inequalities and confidence bands for needlet density estimators on compact homogeneous manifolds
- General oracle inequalities for model selection
- Model selection by resampling penalization
- On the optimality of the aggregate with exponential weights for low temperatures
- ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels
- General nonexact oracle inequalities for classes with a subexponential envelope
- Margin-adaptive model selection in statistical learning
- Adaptive kernel methods using the balancing principle
- Relative deviation learning bounds and generalization with unbounded loss functions
- Nonparametric regression using deep neural networks with ReLU activation function
- Parametric or nonparametric? A parametricness index for model selection
- Minimax adaptive dimension reduction for regression
- Obtaining fast error rates in nonconvex situations
- Performance guarantees for policy learning
- Bayesian fractional posteriors
- Aggregation for Gaussian regression
- Simultaneous adaptation to the margin and to complexity in classification
- Optimal exponential bounds on the accuracy of classification
- Concentration inequalities for two-sample rank processes with application to bipartite ranking
- A new method for estimation and model selection: \(\rho\)-estimation
- Rho-estimators revisited: general theory and applications
- Robust multicategory support vector machines using difference convex algorithm
- Singularity, misspecification and the convergence rate of EM
- Surrogate losses in passive and active learning
- Empirical minimization
- Concentration inequalities and asymptotic results for ratio type empirical processes
- A high-dimensional Wilks phenomenon
- Mass volume curves and anomaly ranking
- A local Vapnik-Chervonenkis complexity
- Fast learning rates for plug-in classifiers
- A universal procedure for aggregating estimators
- Bandwidth selection in kernel empirical risk minimization via the gradient
- Tests and estimation strategies associated to some loss functions
- Direct importance estimation for covariate shift adaptation
- Theory of Classification: a Survey of Some Recent Advances
- On the Optimality of Sample-Based Estimates of the Expectation of the Empirical Minimizer
- Fast rates for estimation error and oracle inequalities for model selection
- Set structured global empirical risk minimizers are rate optimal in general dimensions
- Fast generalization error bound of deep learning without scale invariance of activation functions
- Sparse recovery in convex hulls via entropy penalization
- Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions
- Convergence rates of least squares regression estimators with heavy-tailed errors
- Approximation properties of certain operator-induced norms on Hilbert spaces
- Inference on covariance operators via concentration inequalities: \(k\)-sample tests, classification, and clustering via Rademacher complexities
- Multiplier \(U\)-processes: sharp bounds and applications
- Optimal linear discriminators for the discrete choice model in growing dimensions
- Localized Gaussian width of \(M\)-convex hulls with applications to Lasso and convex aggregation
- Rademacher complexity for Markov chains: applications to kernel smoothing and Metropolis-Hastings
- Nonparametric estimation of low rank matrix valued function
- An elementary analysis of ridge regression with random design
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Nonasymptotic bounds for vector quantization in Hilbert spaces
- Complexities of convex combinations and bounding the generalization error in classification
- Minimax fast rates for discriminant analysis with errors in variables
- Suboptimality of constrained least squares and improvements via non-linear predictors
- A no-free-lunch theorem for multitask learning
Cites Work
- Some limit theorems for empirical processes (with discussion)
- Oracle inequalities and nonparametric function estimation
- Risk bounds for model selection via penalization
- Sharper bounds for Gaussian and empirical processes
- Convergence rate of sieve estimates
- Inequalities for uniform deviations of averages from expectations with applications to nonparametric regression
- Smooth discrimination analysis
- A Bennett concentration inequality and its application to suprema of empirical processes
- Left concentration inequalities for empirical processes
- Moment inequalities for functions of independent random variables
- A distribution-free theory of nonparametric regression
- Empirical margin distributions and bounding the generalization error of combined classifiers
- Bounding the generalization error of convex combinations of classifiers: Balancing the dimensionality and the margins
- Complexity regularization via localized random penalties
- On the Bayes-risk consistency of regularized boosting methods
- Statistical behavior and consistency of classification methods based on convex risk minimization
- Optimal aggregation of classifiers in statistical learning
- Weak convergence and empirical processes. With applications to statistics
- A new look at independence
- Statistical performance of support vector machines
- Empirical minimization
- Concentration inequalities and asymptotic results for ratio type empirical processes
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Complexities of convex combinations and bounding the generalization error in classification
- Local Rademacher complexities
- Model selection for regression on a random design
- Consistency of Support Vector Machines and Other Regularized Kernel Classifiers
- Uniform Central Limit Theorems
- Efficient agnostic learning of neural networks with bounded fan-in
- A sharp concentration inequality with applications
- Rademacher penalties and structural risk minimization
- Improving the sample complexity using global data
- DOI: 10.1162/1532443041424319
- Neural Network Learning
- Convexity, Classification, and Risk Bounds
- An empirical process approach to the uniform consistency of kernel-type function estimators
- Some applications of concentration inequalities to statistics
- On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals
- Model selection and error estimation
- New concentration inequalities in product spaces