Strong points of weak convergence: A study using RPA gradient estimation for automatic learning (Q1301439)
scientific article; zbMATH DE number 1331899
Published: 2 November 1999
The paper deals with a stochastic learning system in which the control parameter is a probability vector. Specifically, it addresses optimization problems of the form: optimize a function \(F(\theta)\), \(\theta \in \mathbb{R}^{d}\), that depends on a probability measure (for instance, through a mathematical expectation in the objective). In contrast to the commonly studied case in which the underlying probability measure does not depend on the decision, the author considers the case where such a functional dependence exists. Problems of this type arise, e.g., in telecommunications, intelligent transportation, and flexible manufacturing systems. To investigate the problem, the author assumes that the analytical form of \(F(\theta)\) is unknown; that \(F(\theta)\) is differentiable, with unknown gradient vector; and that, for every \(\theta\), an unbiased statistical estimate of the gradient vector can be obtained. The stochastic approximation approach is employed to construct the learning procedure. Three methods for estimating the gradient vector are mentioned: two are based on the regenerative estimation approach, and the third, called the non-reset version of the estimator, corresponds to the approach of Kushner and Vázquez-Abad. The aim of the paper is to illustrate the behaviour of such a learning system. A comparison between the proposed scheme and a regenerative one is presented, and it is shown that exploiting weak convergence can improve the convergence rate. The paper concludes with simulation results.
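To make the setup concrete, here is a minimal sketch of a stochastic-approximation update for a probability-vector parameter, assuming (as the review states) that an unbiased estimate of the gradient is available at every \(\theta\). This is not the paper's actual procedure: the toy objective, the noise model, the step-size rule, and all function names are illustrative assumptions.

```python
# Sketch: stochastic approximation over the probability simplex with
# unbiased (noisy) gradient estimates. Everything below is a toy stand-in
# for the system described in the review, not the paper's RPA estimator.
import numpy as np

rng = np.random.default_rng(0)

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    lam = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + lam, 0.0)

def noisy_grad(theta, target):
    """Unbiased estimate of grad F(theta) for the assumed toy objective
    F(theta) = 0.5 * ||theta - target||^2 (zero-mean noise added)."""
    return (theta - target) + rng.normal(scale=0.1, size=theta.shape)

d = 4
target = project_simplex(rng.random(d))   # unknown optimum (hypothetical)
theta = np.full(d, 1.0 / d)               # start from the uniform distribution

for n in range(1, 5001):
    a_n = 1.0 / n                         # classical steps: sum a_n diverges, sum a_n^2 converges
    theta = project_simplex(theta - a_n * noisy_grad(theta, target))

print("estimate:", np.round(theta, 3))
print("target:  ", np.round(target, 3))
```

The projection step keeps the iterates valid probability vectors; the regenerative versus non-reset distinction studied in the paper concerns how the gradient estimate itself is accumulated between updates, which this sketch abstracts away.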
Keywords: estimate of the gradient vector; stochastic approximation; learning; regenerative estimation approach; non-reset version of the estimator; weak convergence