From perturbation analysis to Markov decision processes and reinforcement learning (Q1870309)

scientific article; zbMATH DE number 1908596

    Statements

    11 May 2003
    There are various ways to optimize the performance of a dynamical system, including perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL). Here, the author studies the relationships among these closely related fields and shows that performance potentials play a crucial role in PA, MDPs, and other optimization approaches. RL, neuro-dynamic programming, and related methods are efficient, sample-path-based ways of estimating the performance potentials and \(Q\)-factors. The author points out that the potential-based approach is of practical importance because it can be applied on-line to real systems, which is illustrated with an example.
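    The sample-path-based estimation of \(Q\)-factors mentioned above can be illustrated with a minimal tabular Q-learning sketch. The two-state MDP below, its transition and reward tables, and all parameter values are hypothetical choices for illustration, not taken from the paper under review:

```python
import random

def q_learning(transitions, rewards, n_states, n_actions,
               steps=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a small deterministic MDP.

    transitions[s][a] -> next state; rewards[s][a] -> immediate reward.
    Returns the learned table of Q-factors.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        # epsilon-greedy action choice along a single sample path
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda a: Q[s][a])
        s2, r = transitions[s][a], rewards[s][a]
        # stochastic-approximation update of the Q-factor
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q

# Hypothetical 2-state, 2-action chain: action 1 pays reward 1 in
# every state, action 0 pays nothing; optimal policy is action 1.
T = [[0, 1], [0, 1]]
R = [[0.0, 1.0], [0.0, 1.0]]
Q = q_learning(T, R, n_states=2, n_actions=2)
best = [max(range(2), key=lambda a: Q[s][a]) for s in range(2)]
```

    Note that the update uses only the observed sample path (states, actions, rewards), not the transition probabilities themselves, which is what makes such estimators attractive for on-line application to real systems.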
    on-line algorithms
    Poisson equations
    gradient-based policy iteration
    perturbation analysis
    Q-learning
    TD(\(\lambda\))
    Markov decision processes
    reinforcement learning
    performance potentials
