A Scalable Bootstrap for Massive Data

From MaRDI portal
Publication:5088236

DOI: 10.1111/rssb.12050, OpenAlex: W2146774335, Wikidata: Q63854843, Scholia: Q63854843, MaRDI QID: Q5088236

Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar, Michael I. Jordan

Publication date: 11 July 2022

Published in: Journal of the Royal Statistical Society Series B: Statistical Methodology

Full work available at URL: https://arxiv.org/abs/1112.5016



Related Items

A review of distributed statistical inference
Discussion on ‘A review of distributed statistical inference’
An adaptive lack of fit test for big data
Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators With Massive Data
Partitioned Approach for High-dimensional Confidence Intervals with Large Split Sizes
Nonparametric Bayesian Aggregation for Massive Data
Distributed nonparametric function estimation: optimal rate of convergence and cost of adaptation
A random forest guided tour
Making Recursive Bayesian Inference Accessible
The future of statistics and data science
Method G: Uncertainty Quantification for Distributed Data Problems Using Generalized Fiducial Inference
Online Updating of Survival Analysis
Unnamed Item
Distributed testing and estimation under sparse high dimensional models
Statistical inference in massive datasets by empirical likelihood
Feasible algorithm for linear mixed model for massive data
Communication-Efficient Distributed Linear Discriminant Analysis for Binary Classification
Composite quantile regression for massive datasets
Sufficiency Revisited: Rethinking Statistical Algorithms in the Big Data Era
Model Checking in Large-Scale Dataset via Structure-Adaptive-Sampling
Distributed inference for two‐sample U‐statistics in massive data analysis
Divide and conquer for accelerated failure time model with massive time‐to‐event data
Optimal subsampling for large‐sample quantile regression with massive data
Statistical Inference, Learning and Models in Big Data
Information-based optimal subdata selection for big data logistic regression
Online updating method to correct for measurement error in big data streams
Distributed penalized modal regression for massive data
CEDAR: Communication Efficient Distributed Analysis for Regressions
A model robust subsampling approach for generalised linear models in big data settings
Model aggregation for doubly divided data with large size and large dimension
New metrics and tests for subject prevalence in documents based on topic modeling
Generalized linear models for massive data via doubly-sketching
Confidence interval construction in massive data sets
Quantile varying-coefficient structural equation model
Distributed smoothed rank regression with heterogeneous errors for massive data
Optimal subsampling algorithms for composite quantile regression in massive data
Scaling by subsampling for big data, with applications to statistical learning
A distributed multiple sample testing for massive data
Communication-Efficient Accurate Statistical Estimation
Distributed estimation of functional linear regression with functional responses
A dynamic screening algorithm for hierarchical binary marketing data
Generalised likelihood profiles for models with intractable likelihoods
Optimal Subsampling Bootstrap for Massive Data
Imposing unsupervised constraints to the benefit-of-the-doubt (BoD) model
Distributed function estimation: adaptation using minimal communication
iFusion: Individualized Fusion Learning
Combining Multiple Observational Data Sources to Estimate Causal Effects
Adaptive distributed methods under communication constraints
Testing multivariate quantile by empirical likelihood
Bayesian bootstraps for massive data
A partitioned quasi-likelihood for distributed statistical inference
Principles of experimental design for big data analysis
Can we trust the bootstrap in high-dimension?
High-dimensional integrative analysis with homogeneity and sparsity recovery
Randomized incomplete \(U\)-statistics in high dimensions
Quantile regression under memory constraint
Computing confidence intervals from massive data via penalized quantile smoothing splines
Distributed simultaneous inference in generalized linear models via confidence distribution
A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning
Scatter matrix concordance as a diagnostic for regressions on subsets of data
Distributed statistical inference for massive data
Scalable Bayesian Nonparametric Clustering and Classification
Communication-Efficient Distributed Statistical Inference
A scalable nonparametric specification testing for massive data
Robust and Scalable Bayes via a Median of Subset Posterior Measures
Multiple graph regularized graph transduction via greedy gradient Max-Cut
Projected spline estimation of the nonparametric function in high-dimensional partially linear models for massive data
A Massive Data Framework for M-Estimators with Cubic-Rate
Distributed adaptive Gaussian mean estimation with unknown variance: interactive protocol helps adaptation
Unnamed Item
Subdata selection algorithm for linear model discrimination
A split-and-conquer variable selection approach for high-dimensional general semiparametric models with massive data