
Convergence of Finite Memory Q Learning for POMDPs and Near Optimality of Learned Policies Under Filter Stability


Publication:6122574


DOI: 10.1287/MOOR.2022.1331
arXiv: 2103.12158
OpenAlex: W3136541527
MaRDI QID: Q6122574

Ali Devran Kara, Serdar Yüksel

Publication date: 1 March 2024

Published in: Mathematics of Operations Research

Full work available at URL: https://arxiv.org/abs/2103.12158

zbMATH Keywords

reinforcement learning; partially observed MDP

Mathematics Subject Classification ID

Filtering in stochastic control theory (93E11)
Learning and adaptive systems in artificial intelligence (68T05)
Markov and semi-Markov decision processes (90C40)

Related Items (1)

Formalization of methods for the development of autonomous artificial intelligence systems

