scientific article - MaRDI portal

From MaRDI portal

Publication:2953645

zbMath: 1404.68124; arXiv: 1511.07471; MaRDI QID: Q2953645

Publication date: 5 January 2017

Full work available at URL: https://arxiv.org/abs/1511.07471

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

convergence; Markov decision processes; importance sampling; stochastic approximation; reinforcement learning; approximate policy evaluation; temporal-difference methods

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)

Related Items

Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning
Gradient temporal-difference learning for off-policy evaluation using emphatic weightings
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Distributed consensus-based multi-agent temporal-difference learning
On Generalized Bellman Equations and Temporal-Difference Learning
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
Convergence of Recursive Stochastic Algorithms Using Wasserstein Divergence