Strong points of weak convergence: A study using RPA gradient estimation for automatic learning (Q1301439)
scientific article; zbMATH DE number 1331899
Published: 2 November 1999
The paper deals with a stochastic learning system in which the control parameter is a probability vector. Specifically, it addresses optimization problems of the form: optimize a function \(F(\theta)\), \(\theta \in \mathbb{R}^{d}\), that depends on a probability measure (for instance, through a mathematical expectation in the objective). In contrast to the commonly studied case in which the underlying probability measure does not depend on the decision, the author considers the case where such a functional dependence exists. Problems of this type arise, e.g., in telecommunications, intelligent transportation, and flexible manufacturing systems. To investigate the problem, the author assumes that the analytical form of \(F(\theta)\) is unknown; that \(F(\theta)\) is differentiable, with unknown gradient vector; and that, for every \(\theta\), an unbiased statistical estimate of the gradient vector can be obtained. The stochastic approximation approach is employed to construct the learning procedure. Three methods for estimating the gradient vector are mentioned: two are based on the regenerative estimation approach, and the third, called the non-reset version of the estimator, corresponds to the approach of Kushner and Vázquez-Abad. The aim of the paper is to illustrate the behaviour of such a learning system. A comparison between the proposed scheme and a regenerative one is presented, and it is shown that exploiting weak convergence can improve the convergence rate. The paper concludes with simulation results.
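To make the setup concrete, here is a minimal sketch of a stochastic-approximation update for a probability-vector parameter, assuming (as the review states) that an unbiased estimate of the gradient is available at every \(\theta\). This is not the paper's actual procedure: the toy objective, the noise model, the step-size rule, and all function names are illustrative assumptions.

```python
# Sketch: stochastic approximation over the probability simplex with
# unbiased (noisy) gradient estimates. Everything below is a toy stand-in
# for the system described in the review, not the paper's RPA estimator.
import numpy as np

rng = np.random.default_rng(0)

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    lam = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + lam, 0.0)

def noisy_grad(theta, target):
    """Unbiased estimate of grad F(theta) for the assumed toy objective
    F(theta) = 0.5 * ||theta - target||^2 (zero-mean noise added)."""
    return (theta - target) + rng.normal(scale=0.1, size=theta.shape)

d = 4
target = project_simplex(rng.random(d))   # unknown optimum (hypothetical)
theta = np.full(d, 1.0 / d)               # start from the uniform distribution

for n in range(1, 5001):
    a_n = 1.0 / n                         # classical steps: sum a_n diverges, sum a_n^2 converges
    theta = project_simplex(theta - a_n * noisy_grad(theta, target))

print("estimate:", np.round(theta, 3))
print("target:  ", np.round(target, 3))
```

The projection step keeps the iterates valid probability vectors; the regenerative versus non-reset distinction studied in the paper concerns how the gradient estimate itself is accumulated between updates, which this sketch abstracts away.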
Keywords: estimate of the gradient vector; stochastic approximation; learning; regenerative estimation approach; non-reset version of the estimator; weak convergence