
Estimation and approximation bounds for gradient-based reinforcement learning

From MaRDI portal
Publication:1604222

DOI: 10.1006/jcss.2001.1793
zbMath: 1052.68108
OpenAlex: W1983016559
MaRDI QID: Q1604222

Jonathan Baxter, Peter L. Bartlett

Publication date: 4 July 2002

Published in: Journal of Computer and System Sciences

Full work available at URL: https://doi.org/10.1006/jcss.2001.1793


zbMATH Keywords

Partially Observable Markov Decision Process


Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)


Related Items (1)

Exploiting random walks for learning



Cites Work

  • Unnamed Item
  • Unnamed Item
  • Learning dynamical systems in a stationary environment
  • Nonparametric time series prediction through adaptive model selection
  • Simple statistical gradient-following algorithms for connectionist reinforcement learning
  • Minimum complexity regression estimation with weakly dependent observations
  • On Actor-Critic Algorithms
  • Sensitivity Analysis for Simulations via Likelihood Ratios
  • Neural Network Learning
  • Probability Inequalities for Sums of Bounded Random Variables


This page was last edited on 1 February 2024, at 03:44.