Value iteration and rolling plans for Markov control processes with unbounded rewards (Q1260895)

The paper extends known results for discounted Markov decision processes, namely the convergence of value iteration and the existence of error bounds for rolling horizon procedures, to the case of a general state space and unbounded rewards. The error bounds obtained are pointwise (with respect to the initial state), in contrast to the known uniform bounds. Uniformity is then recovered by using weighted norms. Further, under a strong ergodicity condition the bounds can be improved; this condition assumes that the distributions of the states are bounded below by a positive measure.
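To make the two procedures named above concrete, here is a minimal sketch of value iteration and a rolling horizon policy on a toy finite model. It is not the setting of the paper, which treats general state spaces and unbounded rewards; the transition matrices `P`, rewards `R`, discount factor `BETA` and all function names are hypothetical and chosen only for illustration.

```python
# Toy finite MDP (hypothetical data, not from the paper): 2 states, 2 actions.
# P[a][s][t] = probability of moving from state s to state t under action a.
# R[s][a]    = one-step reward for taking action a in state s.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.6, 0.4]],
]
R = [[1.0, 0.0], [0.0, 2.0]]
BETA = 0.9  # discount factor (assumed)


def q_values(v):
    """One-step lookahead: Q[s][a] = R[s][a] + BETA * sum_t P[a][s][t] * v[t]."""
    n_states, n_actions = len(R), len(P)
    return [
        [R[s][a] + BETA * sum(P[a][s][t] * v[t] for t in range(n_states))
         for a in range(n_actions)]
        for s in range(n_states)
    ]


def value_iteration(n_steps):
    """Return v_n, the optimal n-stage discounted value, starting from v_0 = 0."""
    v = [0.0] * len(R)
    for _ in range(n_steps):
        v = [max(row) for row in q_values(v)]
    return v


def rolling_horizon_policy(horizon):
    """For each state, the first action of an optimal `horizon`-stage plan."""
    v_prev = value_iteration(horizon - 1)
    return [max(range(len(P)), key=lambda a: row[a]) for row in q_values(v_prev)]


# Per-state gap between a short and a long run of value iteration.
v10, v200 = value_iteration(10), value_iteration(200)
gaps = [abs(a - b) for a, b in zip(v10, v200)]
policy = rolling_horizon_policy(5)
```

In this bounded toy case each entry of `gaps` is at most `BETA**10 * max|R| / (1 - BETA)`, which is the classical uniform bound. The gap is computed state by state here only to echo the pointwise flavour of the bounds described above; the weighted-norm argument itself is not reproduced.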

zbMATH Keywords

discounted Markov decision processes

convergence of the value-iteration

strong ergodicity condition

MaRDI profile type

Publication

full work available at URL