Error bounds for constant step-size \(Q\)-learning - MaRDI portal

Publication:1932736

DOI: 10.1016/j.sysconle.2012.08.014, zbMath: 1255.93129, OpenAlex: W1999254175, MaRDI QID: Q1932736

Publication date: 21 January 2013

Published in: Systems & Control Letters

Full work available at URL: https://doi.org/10.1016/j.sysconle.2012.08.014

zbMATH Keywords

Markov decision processes; stochastic approximation; \(Q\)-learning

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Stochastic systems in control theory (general) (93E03)

Related Items

Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
A Discrete-Time Switching System Analysis of Q-Learning
Recent advances in reinforcement learning in finance
Settling the sample complexity of model-based offline reinforcement learning
Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
Convergence of Recursive Stochastic Algorithms Using Wasserstein Divergence

Uses Software

Approxrl
