A learning algorithm for communicating Markov decision processes with unknown transition matrices (Q2844160)
From MaRDI portal
scientific article; zbMATH DE number 6202348
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | A learning algorithm for communicating Markov decision processes with unknown transition matrices | scientific article; zbMATH DE number 6202348 | |
Statements
28 August 2013
adaptive policy
average case
communicating case
learning algorithm
Markov decision processes
reward-penalty type
unknown transition matrix
A learning algorithm for communicating Markov decision processes with unknown transition matrices (English)
0.8435014486312866
0.7895026803016663