Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes (Q3984139)
From MaRDI portal
scientific article; zbMATH DE number 25720
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes | scientific article; zbMATH DE number 25720 | |
Statements
Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes (English)
27 June 1992
discounted Markov decision process
nonstationary value iteration
policy-iteration
policy-value-iteration
asymptotically discount optimal policies