The following pages link to Online Markov Decision Processes (Q3169063):
Displaying 19 items.
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes (Q378753)
- Online regret bounds for Markov decision processes with deterministic transitions (Q982638)
- Online spatio-temporal matching in stochastic and dynamic domains (Q1648078)
- Approachability in Stackelberg stochastic games with vector costs (Q1707454)
- Bayesian adversarial multi-node bandit for optimal smart grid protection against cyber attacks (Q2021298)
- Multi-period orienteering with uncertain adoption likelihood and waiting at customers (Q2282513)
- Poisoning finite-horizon Markov decision processes at design time (Q2668608)
- Policy mirror descent for reinforcement learning: linear convergence, new sampling complexity, and generalized problem classes (Q2687069)
- Reinforcement learning in robust Markov decision processes (Q2833106)
- Simple regret optimization in online planning for Markov decision processes (Q2921080)
- Chasing Ghosts: Competing with Stateful Policies (Q2968152)
- Markov Decision Processes with Arbitrary Reward Processes (Q3169064)
- Online Learning over a Finite Action Set with Limited Switching (Q4991672)
- (Q4999029)
- Temporal concatenation for Markov decision processes (Q5051192)
- (Q5053203)
- Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization (Q5106383)
- An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions (Q5380403)
- Learning Stationary Nash Equilibrium Policies in \(n\)-Player Stochastic Games with Independent Chains (Q6150987)