Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies (Q5139670)

From MaRDI portal

scientific article; zbMATH DE number 7283567

Language	Label	Description	Also known as
English	Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies	scientific article; zbMATH DE number 7283567

Statements

scholarly article

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies (English)

SIAM Journal on Control and Optimization

publication date

10 December 2020

full work available at URL

https://arxiv.org/abs/1906.08383

zbMATH Keywords

reinforcement learning

policy gradient methods

nonconvex optimization

global convergence

describes a project that uses

MaRDI profile type

MaRDI publication profile

An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes

Natural actor-critic algorithms

Stochastic approximation. A dynamical systems viewpoint.

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

On the convergence properties of non-Euclidean extragradient methods for variational inequalities with generalized monotone operators

Actor-Critic-Type Learning Algorithms for Markov Decision Processes

On Actor-Critic Algorithms

Introductory lectures on convex optimization. A basic course.

Cubic regularization of Newton method and its global performance

Nonconvergence to unstable points in urn models and stochastic approximations

Policy gradient in Lipschitz Markov decision processes

Lectures on Stochastic Programming

Simple statistical gradient-following algorithms for connectionist reinforcement learning

Numerical Optimization

Newton-type methods for non-convex optimization under inexact Hessian information

Infinite Time Horizon Maximum Causal Entropy Inverse Reinforcement Learning

On the Convergence of Mirror Descent beyond Stochastic Convex Programming

Identifiers

zbMATH Open document ID

DOI

10.1137/19M1288012

Mathematics Subject Classification ID

zbMATH DE Number

7283567

Sitelinks

Mathematics (1 entry)

mardi Publication:5139670