Bounds for the regret loss in dynamic programming under adaptive control
DOI: 10.1007/BF01916897 · zbMath: 0502.90085 · OpenAlex: W2006974531 · MaRDI QID: Q3968777
Publication date: 1983
Published in: Zeitschrift für Operations Research
Full work available at URL: https://doi.org/10.1007/bf01916897
Related Items (7)
- Continuous dependence of stochastic control models on the noise distribution
- On truncations and perturbations of Markov decision problems with an application to queueing network overflow control
- Nonparametric adaptive control of discrete-time partially observable stochastic systems
- Estimation and control in discounted stochastic dynamic programming
- Generalized Lipschitz-continuity of integrals with respect to a parameter of the integrating probability measure
- First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function
- Adaptive control of Markov processes with incomplete state information and unknown parameters
Cites Work
- Stochastic optimal control. The discrete time case
- Adaptive control of Markov chains, I: Finite parameter set
- The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter
- Strong consistency of a modified maximum likelihood estimator for controlled Markov chains
- Strongly consistent estimation in a controlled Markov renewal model
- Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal
- Estimation and control in Markov chains