Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent
From MaRDI portal
Publication:1693106
DOI: 10.1007/978-3-319-43425-4_17, zbMath: 1380.65024, arXiv: 1605.09703, OpenAlex: W2417191198, MaRDI QID: Q1693106
Luca Bortolussi, Guido Sanguinetti, Ezio Bartocci, Dimitrios Milios, Tomáš Brázdil
Publication date: 11 January 2018
Full work available at URL: https://arxiv.org/abs/1605.09703
unbiased estimation, continuous-time Markov decision processes, statistical model checking, nonlinear population model