Learning reward machines: a study in partially observable reinforcement learning
Publication:6080653
DOI: 10.1016/j.artint.2023.103989, arXiv: 2112.09477, OpenAlex: W4385462630, MaRDI QID: Q6080653
Rodrigo Toro Icarte, Toryn Qwyllyn Klassen, Sheila A. McIlraith, Margarita P. Castro, Richard Valenzano, Ethan Waldie
Publication date: 4 October 2023
Published in: Artificial Intelligence
Full work available at URL: https://arxiv.org/abs/2112.09477
Keywords: reinforcement learning, partial observability, non-Markovian environments, abstractions, automata learning, reward machines
Cites Work
- Learning regular sets from queries and counterexamples
- \({\mathcal Q}\)-learning
- Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
- Learning Moore machines from input-output traces
- Convex Optimization: Algorithms and Complexity
- Complexity of automaton identification from given data
- Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning