Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
Publication: 3520073
DOI: 10.1007/978-3-540-75225-7_30
zbMath: 1142.68403
OpenAlex: W1509780496
MaRDI QID: Q3520073
Publication date: 19 August 2008
Published in: Lecture Notes in Computer Science
Full work available at URL: https://doi.org/10.1007/978-3-540-75225-7_30
MSC classes: Computational learning theory (68Q32); Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40)
Related Items (5)
- Extreme state aggregation beyond Markov decision processes
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes
- Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
- Regret bounds for restless Markov bandits
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Cites Work
- Equivalence notions and model minimization in Markov decision processes
- A new analysis of quasianalysis
- Linear dependence of stationary distributions in ergodic Markov decision processes
- Mixing times with applications to perturbed Markov chains
- Bisimulation Metrics for Continuous Markov Decision Processes
- Learning Theory and Kernel Machines
- Comparison of perturbation bounds for the stationary distribution of a Markov chain