Learning parametric policies and transition probability models of Markov decision processes from data
Publication:2220059
DOI: 10.1016/j.ejcon.2020.04.003; zbMath: 1502.90190; OpenAlex: W3031673669; MaRDI QID: Q2220059
Henghui Zhu, Ioannis Ch. Paschalidis, Ting-Ting Xu
Publication date: 21 January 2021
Published in: European Journal of Control
Full work available at URL: https://doi.org/10.1016/j.ejcon.2020.04.003
regularization, maximum likelihood estimation, Markov decision processes, policy learning, learning transition dynamics
Cites Work
- Nonparametric estimation of Markov transition functions
- Nonparametric Estimation of Conditional Distributions
- Robust Markov Decision Processes
- Learning Policies for Markov Decision Processes From Data
- Robust Control of Markov Decision Processes with Uncertain Transition Matrices
- Elements of Information Theory
- Robust Dynamic Programming