New Versions of Gradient Temporal-Difference Learning - MaRDI portal

From MaRDI portal
Publication:6093230

DOI: 10.1109/TAC.2022.3213763, arXiv: 2109.04033, MaRDI QID: Q6093230

Author name not available

Publication date: 6 October 2023

Published in: IEEE Transactions on Automatic Control

Abstract: Sutton, Szepesvári and Maei introduced the first gradient temporal-difference (GTD) learning algorithms compatible with both linear function approximation and off-policy training. The goal of this paper is (a) to propose some variants of GTDs with extensive comparative analysis and (b) to establish new theoretical analysis frameworks for the GTDs. These variants are based on convex-concave saddle-point interpretations of GTDs, which effectively unify all the GTDs into a single framework, and provide simple stability analysis based on recent results on primal-dual gradient dynamics. Finally, numerical comparative analysis is given to evaluate these approaches.
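
For orientation, here is a minimal sketch of the classic GTD2 update with linear function approximation, i.e. one of the earlier algorithms of Sutton, Szepesvári and Maei that the abstract refers to. It is not one of the new variants proposed in this paper. The function name `gtd2_step`, the parameter names, and the step sizes are illustrative assumptions, and the importance-sampling correction needed for genuine off-policy training is omitted for brevity. The primary weights `theta` and the auxiliary weights `w` are updated together, which is the two-variable structure that a saddle-point (primal-dual) reading builds on.

```python
import numpy as np


def gtd2_step(theta, w, phi, reward, phi_next,
              gamma=0.9, alpha=0.01, beta=0.05):
    """One GTD2 update for a single transition (phi, reward, phi_next).

    theta : primary weights; the value estimate is phi @ theta
    w     : auxiliary weights tracking the projected TD error
    alpha, beta : step sizes (illustrative values, not from the paper)
    Note: no importance-sampling ratio is applied in this sketch.
    """
    # Temporal-difference error under the current value estimate.
    delta = reward + gamma * (phi_next @ theta) - (phi @ theta)
    # Both updates use the old (theta, w) pair.
    theta_new = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    w_new = w + beta * (delta - phi @ w) * phi
    return theta_new, w_new


# Usage: two updates on the same transition with one-hot (tabular) features.
theta = np.zeros(2)
w = np.zeros(2)
phi = np.array([1.0, 0.0])       # features of the current state
phi_next = np.array([0.0, 1.0])  # features of the next state
for _ in range(2):
    theta, w = gtd2_step(theta, w, phi, 1.0, phi_next,
                         gamma=0.5, alpha=0.1, beta=0.5)
```

In the first update `theta` does not move because `w` starts at zero; the auxiliary weights have to pick up the TD error before the primary weights change.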


Full work available at URL: https://arxiv.org/abs/2109.04033






