On the existence of relative values for undiscounted multichain Markov decision processes (Q2265959)

From MaRDI portal
scientific article

    Statements

    On the existence of relative values for undiscounted multichain Markov decision processes (English)
    1984
    The value equation for undiscounted multichain Markov decision processes with finite state and action spaces is transformed into the form \(v=\max \{w(f)+B(f)v;\ f\in S\}\equiv Qv\), where \(S\) is the set of all maximal-gain policies, \(w(f)\) is the bias vector associated with policy \(f\), \(B(f)\) is the equilibrium transition probability matrix, and \(v\) is the value vector. A simple proof then establishes that the operator \(Q\) possesses a fixed point, which implies that a value vector exists.
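
    The fixed-point equation in the abstract can be illustrated on a toy problem. The sketch below is not taken from the article. It assumes that B(f) is the limiting (stationary) matrix of the chain induced by f and that the bias w(f) is normalised so that B(f)w(f) = 0, which is the usual convention but is not confirmed by the abstract. The two-state process, its rewards, and the names `two_state_policy`, `apply_Q`, `f1`, `f2` are hypothetical and chosen only so that both policies have the same gain and therefore both belong to S.

```python
from fractions import Fraction as F


def two_state_policy(a, b, r):
    """Gain, bias and limiting matrix of the chain P = [[1-a, a], [b, 1-b]].

    Assumes 0 < a <= 1 and 0 < b <= 1, so the chain has a single
    recurrent class and a unique stationary distribution pi.
    """
    pi = (b / (a + b), a / (a + b))        # stationary distribution
    g = pi[0] * r[0] + pi[1] * r[1]        # gain, identical in both states
    d = (r[0] - g) / a                     # w[0] - w[1], from (I - P) w = r - g
    w = (pi[1] * d, -pi[0] * d)            # bias, normalised so that pi . w = 0
    B = (pi, pi)                           # every row of the limiting matrix is pi
    return g, w, B


def apply_Q(policies, v):
    """Componentwise (Qv)_i = max over f in S of [w(f) + B(f) v]_i."""
    n = len(v)
    return tuple(
        max(w[i] + sum(B[i][j] * v[j] for j in range(n)) for w, B in policies)
        for i in range(n)
    )


# Hypothetical MDP: state 0 has one action, state 1 has two actions.
# Both resulting stationary policies have gain 1, so S = {f1, f2}.
g1, w1, B1 = two_state_policy(F(1, 2), F(1, 4), (F(3), F(0)))
g2, w2, B2 = two_state_policy(F(1, 2), F(1, 2), (F(3), F(-1)))
S = [(w1, B1), (w2, B2)]

v = (F(4), F(0))            # candidate relative values
print(apply_Q(S, v) == v)   # checks whether v is a fixed point of Q
```

    In this toy case every vector of the form (c + 4, c) is a fixed point of Q, so the relative values are determined only up to an additive constant; exact rational arithmetic is used so that the fixed-point check is an equality rather than a tolerance test.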
    value equation
    undiscounted multichain finite state and action space Markov decision processes
    fixed point

    Identifiers