Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
Publication: 5882386
DOI: 10.1080/03155986.2019.1624491
OpenAlex: W2963530719
Wikidata: Q114100489
Scholia: Q114100489
MaRDI QID: Q5882386
Authors: Somayeh Moazeni, Warren B. Powell, Warren R. Scott
Publication date: 15 March 2023
Published in: INFOR: Information Systems and Operational Research
Full work available at URL: https://arxiv.org/abs/1401.0843
Keywords: dynamic programming; direct policy search; approximate dynamic programming; energy storage; approximate policy iteration; Bellman error minimization
Cites Work
- Smoothing and parametric rules for stochastic mean-CVaR optimal execution strategy
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Instrumental variable methods for system identification
- Asynchronous stochastic approximation and Q-learning
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- On the complexity of energy storage problems
- Optimal price-threshold control for battery operation with aging phenomenon: a quasiconvex optimization approach
- Basis function adaptation in temporal difference reinforcement learning
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Dynamic-Programming Approximations for Stochastic Time-Staged Integer Multicommodity-Flow Problems
- Approximate Dynamic Programming
- Recursive Estimation and Time-Series Analysis
- An Approximate Dynamic Programming Approach to Benchmark Practice-Based Heuristics for Natural Gas Storage Valuation
- The Correlated Knowledge Gradient for Simulation Optimization of Continuous Parameters using Gaussian Process Regression
- Pricing in Electricity Markets: A Mean Reverting Jump Diffusion Model with Seasonality
- Algorithms for Reinforcement Learning
- The Linear Programming Approach to Approximate Dynamic Programming
- An analysis of temporal-difference learning with function approximation
- DOI: 10.1162/1532443041827907
- Least Squares Temporal Difference Methods: An Analysis under General Conditions
- Approximate dynamic programming via iterated Bellman inequalities
- An Optimal Approximate Dynamic Programming Algorithm for Concave, Scalar Storage Problems With Vector-Valued Controls
- Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
- On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming
- Errors in Variables