
On the worst-case analysis of temporal-difference learning algorithms

From MaRDI portal
Publication:1911342

DOI: 10.1007/BF00114725 · zbMath: 0843.68093 · OpenAlex: W1976578332 · MaRDI QID: Q1911342

Manfred K. Warmuth, Robert E. Schapire

Publication date: 21 April 1996

Published in: Machine Learning

Full work available at URL: https://doi.org/10.1007/bf00114725


zbMATH Keywords

learning algorithms; Sutton's method of temporal differences


Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)


Related Items (3)

  • Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control
  • Scalable estimation strategies based on stochastic approximations: classical results and new insights
  • A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation



Cites Work

  • Unnamed Item
  • The convergence of \(TD(\lambda)\) for general \(\lambda\)
  • Matrix Analysis
  • On the Convergence of Stochastic Iterative Dynamic Programming Algorithms




This page was last edited on 1 February 2024, at 14:25.