Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent (Q1693106)
From MaRDI portal
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent | scientific article | |
Statements
Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent (English)
11 January 2018
continuous-time Markov decision processes
statistical model checking
unbiased estimation
nonlinear population model