Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
Publication:6657507
DOI: 10.1007/s00245-024-10207-5; MaRDI QID: Q6657507
Xavier Warin, Huyên Pham, Robert Denkert
Publication date: 6 January 2025
Published in: Applied Mathematics and Optimization
actor-critic algorithms; optimal switching; policy gradient; control randomization; reinforcement learning in continuous time
Artificial intelligence (68Txx); Model systems in control theory (93Cxx); Controllability, observability, and system structure (93Bxx)
Cites Work
- Representation of non-Markovian optimal stopping problems by constrained BSDEs with a single jump
- Valuation of power plants by utility indifference and numerical computation
- Backward SDEs with constrained jumps and quasi-variational inequalities
- Probabilistic representation and approximation for coupled systems of variational inequalities
- Feynman-Kac representation for Hamilton-Jacobi-Bellman IPDE
- Randomized and backward SDE representation for optimal control of non-Markovian SDEs
- Monte-Carlo Valuation of American Options: Facts and New Algorithms to Improve Existing Methods
- Valuation of energy storage: an optimal switching approach
- A stochastic target formulation for optimal switching problems in finite horizon
- Multivariate point processes: predictable projection, Radon-Nikodym derivatives, representation of martingales
- On the Starting and Stopping Problem: Application in Reversible Investments
- Reservoir optimization and Machine Learning methods
- Randomized Optimal Stopping Problem in Continuous time and Reinforcement Learning Algorithm