Estimation and control in discounted stochastic dynamic programming
Publication: 3758580
DOI: 10.1080/17442508708833435
zbMath: 0621.90092
OpenAlex: W2081947709
MaRDI QID: Q3758580
Publication date: 1987
Published in: Stochastics
Full work available at URL: https://doi.org/10.1080/17442508708833435
Keywords: semi-Markov decision model; estimation and control; denumerable state space; unbounded rewards; discounted return criterion
MSC classifications: Queueing theory (aspects of probability theory) (60K25); Queues and service in operations research (90B22); Dynamic programming (90C39); Markov and semi-Markov decision processes (90C40)
Related Items (29)
- Finite-state approximations for denumerable multidimensional state discounted Markov decision processes
- Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes
- Adaptive policies for discrete-time stochastic control systems with unknown disturbance distribution
- Density estimation and adaptive control of Markov processes: Average and discounted criteria
- Nonparametric adaptive control of discrete-time partially observable stochastic systems
- Discretization procedures for adaptive Markov control processes
- Markov control models with unknown random state-action-dependent discount factors
- Adaptive discounted control for piecewise deterministic Markov processes
- Adaptive control of continuous-time linear stochastic systems with discounted cost criterion
- Nonparametric adaptive control of discounted stochastic systems with compact state space
- Adaptive control of constrained Markov chains: Criteria and policies
- Nonparametric estimation and adaptive control in a class of finite Markov decision chains
- Sensitivity of constrained Markov decision processes
- Q-learning for Markov decision processes with a satisfiability criterion
- Adaptive control of stochastic systems with unknown disturbance distribution: discounted criteria
- Recursive adaptive control of Markov decision processes with the average reward criterion
- Ergodic control of multidimensional diffusions. II: Adaptive control
- The actor-critic algorithm as multi-time-scale stochastic approximation
- Analysis of an identification algorithm arising in the adaptive estimation of Markov chains
- Adaptive control of diffusion processes with a discounted reward criterion
- Two person zero-sum semi-Markov games with unknown holding times distribution on one side: A discounted payoff criterion
- The Kumar-Becker-Lin scheme revisited
- Nonstationary value-iteration and adaptive control of discounted semi-Markov processes
- Stability estimation of some Markov controlled processes
- Identification and control in the partially known Merton portfolio selection model
Cites Work
- Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion
- Markov decision processes and strongly excessive functions
- Adaptive control of Markov chains, I: Finite parameter set
- Strongly consistent estimation in a controlled Markov renewal model
- Bounds for the regret loss in dynamic programming under adaptive control
- On Dynamic Programming with Unbounded Rewards
- Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal
- A characterization of geometric ergodicity
- Estimation and control in Markov chains