Finding optimal memoryless policies of POMDPs under the expected average reward criterion (Q418072)
From MaRDI portal
scientific article; zbMATH DE number 6034961
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Finding optimal memoryless policies of POMDPs under the expected average reward criterion | scientific article; zbMATH DE number 6034961 | |
Statements
Finding optimal memoryless policies of POMDPs under the expected average reward criterion (English)
14 May 2012
POMDPs
performance difference
policy iteration with step sizes
correlated actions
memoryless policy
0.9293899
0.88980615
0.87919104
0.86899376
0.8688233
0.8673892
0.86641335