Relative value iteration algorithm with soft state aggregation (Q2705757)
From MaRDI portal
scientific article
Statements
Publication date: 1 August 2001

Keywords: optimal stochastic control; dynamic programming; Markov decision processes; state aggregation; compact representation; value iteration algorithm; contraction
Title: Relative value iteration algorithm with soft state aggregation (English)
A straightforward way to dispel the curse of dimensionality in large stochastic control problems is to replace the lookup-table representation of the value function with a generalized function approximator such as state aggregation. The paper investigates the relative value iteration algorithm for average-reward Markov decision processes (MDPs) with soft state aggregation. Under a contraction condition involving a semi-norm, convergence of the proposed algorithm is proved and an error bound on the approximation is given.
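The scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the membership matrix `phi`, the disaggregation matrix `d`, the reference state `ref`, and the damping factor `alpha` are all assumptions introduced here. Convergence of such an iteration depends on a contraction condition of the kind the paper establishes.

```python
import numpy as np

def relative_vi_soft_agg(P, r, phi, d, ref=0, alpha=0.5, tol=1e-8, max_iter=10_000):
    """Relative value iteration with soft state aggregation (illustrative sketch).

    P   : (A, S, S) transition probabilities P[a, s, s']
    r   : (A, S)    one-step rewards r[a, s]
    phi : (S, K)    soft membership: row s spreads state s over the K
                    aggregate states (rows sum to 1)
    d   : (K, S)    disaggregation distributions (rows sum to 1)
    ref : index of the reference state whose relative value is pinned to 0
    """
    K = phi.shape[1]
    w = np.zeros(K)                           # values of the aggregate states
    for _ in range(max_iter):
        V = phi @ w                           # lift aggregate values to full states
        Q = r + np.einsum('ast,t->as', P, V)  # Bellman backup for each action
        TV = Q.max(axis=0)
        TV = TV - TV[ref]                     # "relative": subtract reference value
        w_new = d @ TV                        # project back onto aggregate states
        if np.max(np.abs(w_new - w)) < tol:
            w = w_new
            break
        w = w + alpha * (w_new - w)           # damped update for stability
    return w, phi @ w
```

With soft (rather than hard) aggregation, each row of `phi` is a probability distribution over aggregate states, so a state can belong partially to several clusters; the `d @ TV` step is one natural way to map the backed-up full-state values back to aggregate weights.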