A learning algorithm for discrete-time stochastic control (Q2711577)
From MaRDI portal
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | A learning algorithm for discrete-time stochastic control | scientific article | |
Statements
2 February 2004
nonlinear control
\(Q\)-learning algorithm
compact state space
compact action space
simulation based algorithm
learning
discrete-time stochastic control
almost sure convergence
A learning algorithm for discrete-time stochastic control (English)
The paper treats a simulation-based algorithm for learning "good" policies for a discrete-time stochastic control process with unknown transition law, where the state and action spaces are both compact subsets of Euclidean spaces. Under suitable conditions, almost sure convergence is proved. The paper is in the spirit of W. L. Baker (Ph.D. thesis, Harvard University, 1997), but it analyzes the full nonlinear case and is in the tradition of the ordinary differential equation approach.
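For orientation only, the following is a minimal sketch of classical tabular \(Q\)-learning on a toy problem. It is not the algorithm analyzed in the paper, which works with compact (continuous) state and action spaces. The two-state, two-action model, the function names `simulate_step` and `q_learning`, the discount factor, and the step-size exponent are all assumptions made for illustration. The learner never reads the transition law; it only draws samples from the simulator, and it uses decreasing step sizes of the kind typically assumed in stochastic approximation arguments.

```python
import random


def simulate_step(state, action, rng):
    """Hypothetical simulator: returns one sampled (reward, next_state).

    The learner only sees these samples, never the transition law itself.
    In this toy model, action 1 pays reward 1, action 0 pays reward 0,
    and the next state is drawn uniformly from {0, 1}.
    """
    reward = 1.0 if action == 1 else 0.0
    next_state = rng.randrange(2)
    return reward, next_state


def q_learning(num_steps=20000, discount=0.5, seed=0):
    """Tabular Q-learning with uniform exploration (illustrative only)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]   # q[state][action]
    visits = [[0, 0], [0, 0]]      # visit counts per (state, action)
    state = 0
    for _ in range(num_steps):
        action = rng.randrange(2)  # explore uniformly at random
        reward, next_state = simulate_step(state, action, rng)
        visits[state][action] += 1
        # Step size n^(-0.7): the sum diverges, the sum of squares converges.
        step = visits[state][action] ** -0.7
        target = reward + discount * max(q[next_state])
        q[state][action] += step * (target - q[state][action])
        state = next_state
    return q


q_table = q_learning()
# Greedy policy: in each state, pick the action with the largest Q-value.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(2)]
```

In this toy model the optimal values can be computed by hand from the Bellman equation: with discount 0.5, the optimal value of each state is 2, so the optimal \(Q\)-value is 2 for action 1 and 1 for action 0. The learned table should approach these values as the number of simulated steps grows.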