scientific article; zbMATH DE number 7370555
Publication:4998920
Andrzej Ruszczyński, Umit Köse
Publication date: 9 July 2021
Full work available at URL: https://jmlr.csail.mit.edu/papers/v22/20-168.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
stochastic approximation, reinforcement learning, temporal difference methods, dynamic risk measures, linear function approximation
Related Items
- Discrete-time risk-aware optimal switching with non-adapted costs
- An Integrated Transportation Distance between Kernels and Approximate Dynamic Risk Evaluation in Markov Systems
- Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning
- Mini-Batch Risk Forms
- Reinforcement learning with dynamic convex risk measures
- Risk-averse autonomous systems: a brief history and recent developments from the perspective of optimal control
Cites Work
- Risk-averse dynamic programming for Markov decision processes
- Dynamic monetary risk measures for bounded discrete-time processes
- Théorèmes de convergence presque sûre pour une classe d'algorithmes stochastiques à pas décroissant
- Mean value theorem for convex functions
- Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes
- Asynchronous stochastic approximation and Q-learning
- Optimal long term growth rate of expected utility of wealth
- From stochastic dominance to mean-risk models: Semideviations as risk measures
- Risk measurement and risk-averse control of partially observable discrete-time Markov systems
- Risk-averse model predictive control
- Risk sensitive control of finite state Markov chains in discrete time, with applications to portfolio management
- \({\mathcal Q}\)-learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Mean, variance and probabilistic criteria in finite Markov decision processes: A review
- Time-consistent investment policies in Markovian markets: a case of mean-variance analysis
- Algorithmic aspects of mean-variance optimization in Markov decision processes
- Statistical estimation of composite risk functionals and risk optimization problems
- Coherent multiperiod risk adjusted values and Bellman's principle
- Dynamic coherent risk measures
- Markov decision processes with a new optimality criterion: Discrete time
- Coherent Measures of Risk
- Risk-Sensitive Markov Control Processes
- Computational Methods for Risk-Averse Undiscounted Transient Markov Models
- Markov Decision Problems Where Means Bound Variances
- Composition of Time-Consistent Dynamic Monetary Risk Measures in Discrete Time
- Approximate Dynamic Programming
- A Learning Algorithm for Risk-Sensitive Cost
- Stochastic approximation method with gradient averaging for unconstrained problems
- Discounted MDP’s: Distribution Functions and Exponential Utility Maximization
- Variance-Penalized Markov Decision Processes
- Mean value theorems in nonsmooth analysis
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- An analysis of temporal-difference learning with function approximation
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
- Approximate Value Iteration for Risk-Aware Markov Decision Processes
- Risk-Sensitive Control of Discrete-Time Markov Processes with Infinite Horizon
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Persistently Optimal Policies in Stochastic Dynamic Programming with Generalized Discounting
- More Risk-Sensitive Markov Decision Processes
- Risk-Averse Access Point Selection in Wireless Communication Networks
- Risk-Averse Control of Undiscounted Transient Markov Models
- Robust Control of Markov Decision Processes with Uncertain Transition Matrices
- Sequential Decision Making With Coherent Risk
- Optimization of Convex Risk Functions
- Conditional Risk Mappings
- Risk-Sensitive Markov Decision Processes
- Coherent Acceptability Measures in Multiperiod Models
- Q-Learning for Risk-Sensitive Control
- Robust Dynamic Programming
- Polynomial Approximation--A New Computational Technique in Dynamic Programming: Allocation Processes
- A sensitivity formula for risk-sensitive cost and the actor-critic algorithm