Reward revision and the average reward Markov decision process (Q1097179)

scientific article; zbMATH DE number 4033533

    Statements

    Reward revision and the average reward Markov decision process (English)
    1987
    We integrate two numerical procedures for solving the average reward Markov decision process (MDP), standard successive approximations and modified policy iteration, with reward revision. Reward revision is the process of revising the reward structure of a second, more computationally desirable MDP so as to produce, in the limit, an optimality equation having a fixed point identical to that associated with the original MDP. A numerical study indicates that for MDPs having a non-sparse structure with a small number of relatively large entries per row, the addition of reward revision can have significant computational benefits.
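
    As background for the abstract, the following is a minimal sketch of standard successive approximations (relative value iteration) for an average reward MDP. It is not the reward revision procedure of the article, whose details are not given in this record. The function name, array layout, tolerance, and the two-state example are all assumptions made for illustration, and convergence is assumed to rely on a unichain, aperiodic transition structure.

```python
import numpy as np

def relative_value_iteration(P, r, ref_state=0, tol=1e-8, max_iter=10_000):
    """Illustrative sketch (hypothetical helper, not from the article).

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a)
    r: array of shape (A, S), one-step reward for action a in state s
    Returns (gain estimate, relative values h, greedy policy).
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    h = np.zeros(P.shape[1])
    for _ in range(max_iter):
        q = r + P @ h            # (A, S): one-step lookahead value of each action
        v = q.max(axis=0)        # maximize over actions in each state
        diff = v - h
        # min(diff) and max(diff) bracket the optimal gain; stop when they meet
        if diff.max() - diff.min() < tol:
            break
        h = v - v[ref_state]     # renormalize so the iterates stay bounded
    gain = 0.5 * (diff.max() + diff.min())
    return gain, v - v[ref_state], q.argmax(axis=0)

# Assumed toy problem: both actions move to either state with probability 0.5.
P = np.full((2, 2, 2), 0.5)
r = np.array([[1.0, 0.0],   # action 0: reward 1 in state 0, 0 in state 1
              [0.0, 3.0]])  # action 1: reward 0 in state 0, 3 in state 1
gain, h, policy = relative_value_iteration(P, r)
print(gain, h, policy)
```

    In this toy problem every policy spends half its time in each state, so the best choice is action 0 in state 0 and action 1 in state 1, for an average reward of 2 per step.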
    average reward Markov decision process
    successive approximations
    modified policy iteration
    reward revision

    Identifiers