Markov decision processes (Q5894023)

scientific article; zbMATH DE number 5834332

    Statements

    Markov decision processes (English)
    10 January 2011
    The authors consider fully observable Markov decision processes with finite and infinite time horizons. In the finite-horizon case, the Markov decision problem can be solved via a Bellman equation, provided that an integrability assumption and a structure assumption hold. Sufficient conditions for both assumptions are discussed, and two applications, a card game and a stochastic linear-quadratic control problem, illustrate the solution method. In the infinite-horizon case, the authors show that, under certain conditions, both the reward value and an optimal policy can be approximated by the sequence of reward values and optimal policies of the corresponding finite-horizon problems. Bandit and dividend pay-out problems illustrate this part of the theory.
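
    The finite-horizon Bellman recursion described in the review can be sketched on a toy example. The block below is a minimal illustration and not the authors' formulation: the two-state transition table `P`, the reward table `R`, the function name `backward_induction`, and the discount factor `beta` are all assumptions introduced here. Discounting is used only as one simple condition under which the finite-horizon values settle as the horizon grows.

```python
# Hypothetical toy MDP (not from the reviewed article): 2 states, 2 actions.
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the one-stage reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: 3.0}}


def backward_induction(P, R, horizon, beta=1.0):
    """Solve the finite-horizon Bellman equation by backward induction.

    Returns (V, policy), where V[s] is the optimal expected total reward
    from state s with `horizon` stages to go (terminal reward assumed zero)
    and policy[n][s] is a maximizing action at stage n = 0, ..., horizon - 1.
    """
    states = sorted(P)
    V = {s: 0.0 for s in states}  # assumed zero terminal reward
    policy = []
    for _ in range(horizon):
        new_V, rule = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in sorted(P[s]):
                # One-stage reward plus discounted expected continuation value.
                q = R[s][a] + beta * sum(p * V[t] for t, p in P[s][a])
                if q > best_q:
                    best_a, best_q = a, q
            new_V[s], rule[s] = best_q, best_a
        V = new_V
        policy.insert(0, rule)  # rules are computed from the last stage backwards
    return V, policy


# Two-stage problem without discounting.
V2, policy2 = backward_induction(P, R, horizon=2)

# With beta < 1 (an assumption), long finite horizons approximate the
# infinite-horizon discounted values.
V_long, _ = backward_induction(P, R, horizon=200, beta=0.9)
```

    In this toy model, state 1 can earn a reward of 3 forever by choosing action 1, so its infinite-horizon discounted value with `beta = 0.9` is 3 / (1 - 0.9) = 30, and `V_long[1]` comes out close to that number. The general conditions under which such an approximation is valid are the subject of the reviewed article and are not captured by this sketch.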
    Markov decision process
    Markov chain
    Bellman equation
    policy improvement
    linear programming

    Identifiers