Convergence of Finite Memory Q Learning for POMDPs and Near Optimality of Learned Policies Under Filter Stability
Publication:6122574
DOI: 10.1287/MOOR.2022.1331
arXiv: 2103.12158
OpenAlex: W3136541527
MaRDI QID: Q6122574
Ali Devran Kara, Serdar Yüksel
Publication date: 1 March 2024
Published in: Mathematics of Operations Research
Full work available at URL: https://arxiv.org/abs/2103.12158
Filtering in stochastic control theory (93E11); Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40)