scientific article

From MaRDI portal

Publication:2896031

zbMath: 1242.68217

MaRDI QID: Q2896031

Ron Meir, Dotan di Castro

Publication date: 13 July 2012

Full work available at URL: http://www.jmlr.org/papers/v11/dicastro10a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

temporal difference; actor critic; single time scale convergence

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05) ⋮ Stopping times; optimal stopping problems; gambling theory (60G40) ⋮ Online algorithms; streaming algorithms (68W27)

Related Items (3)

A Small Gain Analysis of Single Timescale Actor Critic ⋮ On the sample complexity of actor-critic method for reinforcement learning with function approximation ⋮ Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
