Convergence of discretization procedure in \(Q\)-learning - MaRDI portal

Convergence of discretization procedure in \(Q\)-learning (Q2725088)

scientific article; zbMATH DE number 1618764

Language	Label	Description	Also known as
English	Convergence of discretization procedure in \(Q\)-learning	scientific article; zbMATH DE number 1618764

Statements

scholarly article

publication date

21 April 2002

zbMATH Keywords

\(Q\)-learning
dynamic programming
discretization

MaRDI profile type

Recommended article	Similarity Score	Recommender Run
Convergence of a Q-learning Variant for Continuous States and Actions	0.91444	Recommender Run 3
On the convergence of reinforcement learning	0.89569104	Recommender Run 3
The convergence of value iteration in discounted Markov decision processes	0.8802094	Recommender Run 3
Q-learning and enhanced policy iteration in discounted dynamic programming	0.8772295	Recommender Run 3
Machine Learning: ECML 2004	0.87379855	Recommender Run 3
Reinforcement learning via approximation of the Q-function	0.8730544	Recommender Run 3
Boundedness of iterates in \(Q\)-learning	0.8720821	Recommender Run 3
Convergence results for single-step on-policy reinforcement-learning algorithms	0.87133706	Recommender Run 3

Convergence of discretization procedure in \(Q\)-learning (English)

Control Theory & Applications

The authors show that, under certain compactness and Lipschitz continuity assumptions, the optimal solution obtained with \(Q\)-learning converges almost surely to the optimal solution obtained with the continuous dynamic programming algorithm as the maximal grid spacing of the discretization approaches zero.
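
To make the setting concrete, the following is a minimal sketch of tabular \(Q\)-learning on a uniformly discretized continuous state space. It is not the authors' construction: the toy problem (state in [0, 1], moves of size 0.1, reward for staying near 0.5), the function names `to_bin` and `q_learning_discretized`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def to_bin(x, n_bins):
    """Map a state x in [0, 1] to the index of its grid cell."""
    return min(int(x * n_bins), n_bins - 1)


def q_learning_discretized(n_bins, episodes=2000, steps=20,
                           alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a uniform grid over the state space [0, 1].

    Hypothetical toy problem: the agent moves left or right by 0.1 and
    is rewarded for being close to x = 0.5.
    """
    rng = random.Random(seed)
    moves = (-0.1, 0.1)                     # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_bins)]
    for _ in range(episodes):
        x = rng.random()                    # random continuous start state
        for _ in range(steps):
            s = to_bin(x, n_bins)
            a = rng.randrange(2)            # uniformly random exploration
            x_next = min(1.0, max(0.0, x + moves[a]))
            reward = -abs(x_next - 0.5)
            s_next = to_bin(x_next, n_bins)
            # Standard Q-learning update toward the one-step target.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            x = x_next
    return q


if __name__ == "__main__":
    # Print the greedy action per grid cell for successively finer grids.
    for n in (5, 10, 20):
        q = q_learning_discretized(n)
        greedy = ["L" if row[0] > row[1] else "R" for row in q]
        print(n, "".join(greedy))
```

Increasing `n_bins` shrinks the grid spacing, which is the limit the statement above refers to. The sketch only lets one observe how the learned table changes with the grid; it does not verify the compactness or Lipschitz conditions the result depends on.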

Identifiers

zbMATH Open document ID

Mathematics Subject Classification ID

zbMATH DE Number

1618764

Sitelinks

Mathematics (1 entry)

mardi Publication:2725088

Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Item:Q2725088&oldid=41933332"