TD-regularized actor-critic methods

Publication:2320580

DOI: 10.1007/s10994-019-05788-0
zbMath: 1493.68313
arXiv: 1812.08288
OpenAlex: W3104595455
Wikidata: Q128334424
Scholia: Q128334424
MaRDI QID: Q2320580

Mohammad Emtiyaz Khan, Jan Peters, Voot Tangkaratt, Simone Parisi

Publication date: 23 August 2019

Published in: Machine Learning

Full work available at URL: https://arxiv.org/abs/1812.08288

zbMATH Keywords

reinforcement learning; actor-critic; temporal difference

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Markov and semi-Markov decision processes (90C40)
Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)

Related Items (3)

Optimistic reinforcement learning by forward Kullback-Leibler divergence optimization
On the sample complexity of actor-critic method for reinforcement learning with function approximation
td-reg
