A basic formula for online policy gradient algorithms
Publication:5274061
DOI: 10.1109/TAC.2005.847037
zbMath: 1365.90256
OpenAlex: W2096899507
MaRDI QID: Q5274061
Publication date: 12 July 2017
Published in: IEEE Transactions on Automatic Control
Full work available at URL: https://doi.org/10.1109/tac.2005.847037
Sensitivity, stability, parametric optimization (90C31) ⋮ Estimation and detection in stochastic control theory (93E10) ⋮ Markov and semi-Markov decision processes (90C40)
Related Items (2)
A reinforcement-learning approach for admission control in distributed network service systems ⋮ On-line policy gradient estimation with multi-step sampling