Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
Publication:6153988
DOI: 10.1080/01621459.2022.2108816; arXiv: 2009.02003; OpenAlex: W4287672183; MaRDI QID: Q6153988
Zhaoran Wang, Run-Ze Li, Yining Wang, Ethan X. Fang, Yi Chen
Publication date: 19 March 2024
Published in: Journal of the American Statistical Association
Full work available at URL: https://arxiv.org/abs/2009.02003
Cites Work
- Best subset selection via a modern optimization lens
- Q-learning with censored data
- Statistics for high-dimensional data. Methods, theory and applications.
- Targeted sequential design for targeted learning inference of the optimal treatment rule and its mean reward
- Iterative hard thresholding for compressed sensing
- Asymptotically efficient adaptive allocation rules
- Reinforcement learning with immediate rewards and linear hypotheses
- \({\mathcal Q}\)-learning
- Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
- Sparse learning via Boolean relaxations
- Simultaneous analysis of Lasso and Dantzig selector
- The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). (With discussions and rejoinder).
- Non-Stationary Stochastic Optimization
- Penalized Q-learning for dynamic treatment regimens
- Linearly Parameterized Bandits
- Estimating Individualized Treatment Rules Using Outcome Weighted Learning
- Optimal Dynamic Treatment Regimes
- 10.1162/153244303321897663
- Sparse Approximate Solutions to Linear Systems
- A Robust Method for Estimating Optimal Treatment Regimes
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using \(\ell_{1}\)-Constrained Quadratic Programming (Lasso)
- Algorithm Selection for Combinatorial Search Problems: A Survey
- Bandit Algorithms
- A linear response bandit problem
- Doubly robust learning for estimating individualized treatment with censored data
- Inference for non-regular parameters in optimal dynamic treatment regimes
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Sparsity regret bounds for individual sequences in online linear regression
- Demystifying Optimal Dynamic Treatment Regimes
- Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
- Compressed sensing
- Finite-time analysis of the multiarmed bandit problem
- Randomized allocation with arm elimination in a bandit problem with covariates