A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets
From MaRDI portal
Publication:6138596
DOI: 10.1214/22-AOAS1700
arXiv: 2202.10574
MaRDI QID: Q6138596
Hong-Tu Zhu, Shikai Luo, Chengchun Shi, Ge Song, Runzhe Wan, Rui Song
Publication date: 16 January 2024
Published in: The Annals of Applied Statistics
Full work available at URL: https://arxiv.org/abs/2202.10574
Cites Work
- Doubly robust policy evaluation and optimization
- Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
- Performance guarantees for individualized treatment rules
- Basic properties of strong mixing conditions. A survey and some open questions
- High-dimensional \(A\)-learning for optimal dynamic treatment regimes
- Bayesian method for causal inference in spatially-correlated multivariate time series
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Batch policy learning in average reward Markov decision processes
- The stratified micro-randomized trial design: sample size considerations for testing nested causal effects of time-varying treatments
- Using decision lists to construct interpretable and parsimonious treatment regimes
- Evaluating marker-guided treatment selection strategies
- Penalized Q-learning for dynamic treatment regimens
- Interpretable Dynamic Treatment Regimes
- Toward Causal Inference With Interference
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Quantile-Optimal Treatment Regimes
- Program Evaluation and Causal Inference With High-Dimensional Data
- Estimating Individualized Treatment Rules Using Outcome Weighted Learning
- Optimal Dynamic Treatment Regimes
- Exact p-Values for Network Interference
- A Robust Method for Estimating Optimal Treatment Regimes
- Maximin Projection Learning for Optimal Treatment Decision with Heterogeneous Individualized Treatment Effects
- Learning Optimal Distributionally Robust Individualized Treatment Rules
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning
- Time Series Experiments and Causal Estimands: Exact Randomization Tests and Trading
- Causal Inference for Statistics, Social, and Biomedical Sciences
- Inference for non-regular parameters in optimal dynamic treatment regimes
- Greedy outcome weighted tree learning of optimal personalized treatment rules
- Optimal Structural Nested Models for Optimal Sequential Decisions
- New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes
- Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions
- Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score
- Off-Policy Estimation of Long-Term Average Outcomes With Applications to Mobile Health
- Personalized Policy Learning Using Longitudinal Mobile Health Data
- Resampling‐based confidence intervals for model‐free robust inference on optimal treatment regimes