Geometry of policy improvement

Publication:1689145

DOI: 10.1007/978-3-319-68445-1_33; zbMath: 1426.91076; arXiv: 1704.01785; OpenAlex: W2606500941; MaRDI QID: Q1689145

Johannes Rauh, Guido Montúfar

Publication date: 12 January 2018

Full work available at URL: https://arxiv.org/abs/1704.01785

zbMATH Keywords

reinforcement learning; partially observable Markov decision process; memoryless stochastic policy; policy gradient theorem

Mathematics Subject Classification ID

Decision theory (91B06)

Related Items (1)

Algebraic optimization of sequential decision problems
