Monotone value iteration for discounted finite Markov decision processes (Q1076618)
From MaRDI portal
scientific article; zbMATH DE number 3954679
Statements
Monotone value iteration for discounted finite Markov decision processes (English)
1985
This paper studies two modifications of the value iteration scheme for a finite-state, finite-action Markov decision process under the total expected discounted reward criterion. The first variant replaces the discount factor appearing in the value iteration scheme by a quantity that depends on the results of the previous iterations. The second variant adds a perturbation to the value vector at each iteration. The author shows that, for suitable choices of the parameters, both methods converge monotonically to the optimal solution, and presents examples in which these modifications perform better than the usual value iteration scheme.
Keywords: value iteration; finite state, finite actions Markovian decision process; total expected discounted reward criterion