TD-regularized actor-critic methods
Publication:2320580
DOI: 10.1007/s10994-019-05788-0 ⋮ zbMath: 1493.68313 ⋮ arXiv: 1812.08288 ⋮ OpenAlex: W3104595455 ⋮ Wikidata: Q128334424 ⋮ Scholia: Q128334424 ⋮ MaRDI QID: Q2320580
Mohammad Emtiyaz Khan, Jan Peters, Voot Tangkaratt, Simone Parisi
Publication date: 23 August 2019
Published in: Machine Learning
Full work available at URL: https://arxiv.org/abs/1812.08288
Learning and adaptive systems in artificial intelligence (68T05) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)
Related Items (3)
Optimistic reinforcement learning by forward Kullback-Leibler divergence optimization ⋮ On the sample complexity of actor-critic method for reinforcement learning with function approximation ⋮ td-reg