scientific article; zbMATH DE number 5037121 - MaRDI portal

Publication:5477860

DOI: 10.1023/A:1018008221616; zbMath: 1099.90586; MaRDI QID: Q5477860

Benjamin van Roy, John N. Tsitsiklis

Publication date: 29 June 2006

Published in: Machine Learning

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

features; dynamic programming; reinforcement learning; curse of dimensionality; function approximation; neuro-dynamic programming; compact representation

Mathematics Subject Classification ID

Dynamic programming (90C39)

Related Items

Practical solution techniques for first-order MDPs
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
Online capacity planning for rehabilitation treatments: an approximate dynamic programming approach
Q-learning and policy iteration algorithms for stochastic shortest path problems
Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
Policy iteration based feedback control
Water reservoir control under economic, social and environmental constraints
Q-learning with censored data
Continuous state dynamic programming via nonexpansive approximation
Basis function adaptation in temporal difference reinforcement learning
Approximate dynamic programming with a fuzzy parameterization
Decomposition of large-scale stochastic optimal control problems
Stochastic approximation algorithms: overview and recent trends
Stochastic decomposition applied to large-scale hydro valleys management
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
A hybrid dynamic programming -- Tabu search approach for the long-term hydropower scheduling problem
Stochastic dynamic programming with factored representations