Reward revision and the average reward Markov decision process (Q1097179)

scientific article; zbMATH DE number 4033533


Statements

instance of

scholarly article

title

Reward revision and the average reward Markov decision process (English)

published in

OR Spektrum

publication date

1987

review text

We integrate two numerical procedures for solving the average reward Markov decision process (MDP), standard successive approximations and modified policy iteration with reward revision. Reward revision is the process of revising the reward structure of a second, more computationally desirable MDP so as to produce, in the limit, an optimality equation having a fixed point identical to that associated with the original MDP. A numerical study indicates that for MDPs having a non-sparse structure with a small number of relatively large entries per row, the addition of reward revision can have significant computational benefits.
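
For orientation, the following is a minimal sketch of standard successive approximations for an average reward MDP, in the relative value iteration form. It is not the article's reward revision procedure, which this record does not specify. The function name, the argument layout (P[a][s][t] for transition probabilities, r[a][s] for rewards), the reference state, and the two-state example data are all illustrative assumptions, and the sketch assumes a unichain, aperiodic MDP so that the iteration converges.

```python
def relative_value_iteration(P, r, tol=1e-10, max_iter=10_000, ref_state=0):
    """Successive approximations for an average reward MDP (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t under
    action a; r[a][s] is the one-step reward. Returns (gain, h, policy),
    where h is the relative value vector normalized so that h[ref_state] = 0.
    Assumes a unichain, aperiodic MDP.
    """
    n_actions = len(P)
    n_states = len(P[0])
    h = [0.0] * n_states
    gain = 0.0
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # One backup of the optimality operator for every state-action pair.
        q = [
            [
                r[a][s] + sum(P[a][s][t] * h[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        v = [max(row) for row in q]
        # Subtract the value at the reference state to keep iterates bounded.
        gain = v[ref_state]
        new_h = [x - gain for x in v]
        # Stop when the span of the change between iterates is small.
        diff = [new_h[s] - h[s] for s in range(n_states)]
        h = new_h
        if max(diff) - min(diff) < tol:
            break
    # Greedy policy with respect to the last backup.
    policy = [max(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return gain, h, policy


# Hypothetical two-state, two-action example (not taken from the article).
P_example = [
    [[0.5, 0.5], [0.5, 0.5]],  # action 0
    [[0.9, 0.1], [0.9, 0.1]],  # action 1
]
r_example = [
    [1.0, 0.0],   # rewards under action 0
    [0.8, -0.2],  # rewards under action 1
]

if __name__ == "__main__":
    g, h, pi = relative_value_iteration(P_example, r_example)
    print("gain:", g, "relative values:", h, "policy:", pi)
```

On this example the iteration should settle on action 1 in both states with a gain of about 0.7. In the approach described in the review, a scheme of this kind would be applied to a revised, computationally cheaper MDP, but how the rewards are revised is left to the article itself.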

zbMATH Keywords

average reward Markov decision process

successive approximations

modified policy iteration

reward revision

author

Chelsea C. White