Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
From MaRDI portal
Publication: 6153987
DOI: 10.1080/01621459.2022.2106868
arXiv: 2202.13163
Wikidata: Q114898043 Scholia: Q114898043
MaRDI QID: Q6153987
Chengchun Shi, Rui Song, Yuan Le, Shikai Luo, Hong-Tu Zhu
Publication date: 19 March 2024
Published in: Journal of the American Statistical Association
Full work available at URL: https://arxiv.org/abs/2202.13163
Keywords: rate of convergence; reinforcement learning; advantage learning; infinite horizons; mobile health applications
Cites Work
- Doubly Robust Estimation in Missing Data and Causal Inference Models
- \(Q\)- and \(A\)-learning methods for estimating optimal dynamic treatment regimes
- Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
- Performance guarantees for individualized treatment rules
- Basic properties of strong mixing conditions. A survey and some open questions
- Fast learning rates for plug-in classifiers
- High-dimensional \(A\)-learning for optimal dynamic treatment regimes
- \({\mathcal Q}\)-learning
- Optimal global rates of convergence for nonparametric regression
- Optimal aggregation of classifiers in statistical learning.
- Doubly-robust dynamic treatment regimen estimation via weighted least squares
- Penalized Q-learning for dynamic treatment regimens
- Interpretable Dynamic Treatment Regimes
- Quantile-Optimal Treatment Regimes
- Constructing dynamic treatment regimes over indefinite time horizons
- Optimal Dynamic Treatment Regimes
- Maximin Projection Learning for Optimal Treatment Decision with Heterogeneous Individualized Treatment Effects
- Mathematical Foundations of Infinite-Dimensional Statistical Models
- A Simple Method for Estimating Interactions Between a Treatment and a Large Number of Covariates
- Double/debiased machine learning for treatment and structural parameters
- Multi-Armed Angle-Based Direct Learning for Estimating Optimal Individualized Treatment Rules With Various Outcomes
- Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning
- Inference for non-regular parameters in optimal dynamic treatment regimes
- Greedy outcome weighted tree learning of optimal personalized treatment rules
- Optimal Structural Nested Models for Optimal Sequential Decisions
- New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes
- Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions
- Learning When-to-Treat Policies
- Personalized Policy Learning Using Longitudinal Mobile Health Data