On the existence of relative values for undiscounted multichain Markov decision processes (Q2265959)
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | On the existence of relative values for undiscounted multichain Markov decision processes | scientific article | |
Statements
On the existence of relative values for undiscounted multichain Markov decision processes (English)
1984
The value equation in undiscounted multichain finite state and action space Markov decision processes is transformed into the form \(v=\max \{w(f)+B(f)v;\ f\in S\}\equiv Qv\), where \(S\) is the set of all maximal-gain policies, \(w(f)\) is the bias vector associated with policy \(f\), \(B(f)\) is the equilibrium transition probability matrix, and \(v\) is the value vector. A simple proof then establishes that the operator \(Q\) possesses a fixed point, which implies that a value vector exists.
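As a rough illustration of the operator \(Q\) described above, the following Python sketch evaluates \(Qv\) as a componentwise maximum of \(w(f)+B(f)v\) over a small set of policies and checks whether a candidate \(v\) is a fixed point. The function names `apply_q` and `is_fixed_point` and the two-state toy data are assumptions made for illustration only; they are not taken from the article, and the sketch does not reproduce the article's existence proof.

```python
def apply_q(v, policies):
    """Return Qv: the componentwise max over policies of w(f) + B(f) v.

    `policies` is a list of (w, B) pairs, where w is a list of length n
    and B is an n-by-n list of rows. The names are hypothetical.
    """
    n = len(v)
    candidates = []
    for w, B in policies:
        # One candidate vector w(f) + B(f) v per policy f.
        candidates.append(
            [w[i] + sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        )
    # The maximum is taken separately in each state (component).
    return [max(c[i] for c in candidates) for i in range(n)]


def is_fixed_point(v, policies, tol=1e-12):
    """Check whether Qv equals v up to a small tolerance."""
    qv = apply_q(v, policies)
    return all(abs(a - b) <= tol for a, b in zip(qv, v))


# Toy data (assumed, not from the article): two states, two policies.
# Policy 1: uniform equilibrium matrix with bias vector (1, -1).
# Policy 2: identity equilibrium matrix with zero bias vector.
toy_policies = [
    ([1.0, -1.0], [[0.5, 0.5], [0.5, 0.5]]),
    ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
]

candidate_v = [1.0, -1.0]
print(apply_q(candidate_v, toy_policies))
print(is_fixed_point(candidate_v, toy_policies))
```

In this toy instance the candidate \(v=(1,-1)\) satisfies \(Qv=v\), whereas \(v=(0,0)\) does not. The sketch only verifies a given candidate; it makes no claim that repeatedly applying \(Q\) converges to a fixed point.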
value equation
undiscounted multichain finite state and action space Markov decision processes
fixed point