Explicit explore, exploit, or escape \((E^4)\): near-optimal safety-constrained reinforcement learning in polynomial time (Q6106432)
From MaRDI portal
scientific article; zbMATH DE number 7702687
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Explicit explore, exploit, or escape \((E^4)\): near-optimal safety-constrained reinforcement learning in polynomial time | scientific article; zbMATH DE number 7702687 | |
Statements
Explicit explore, exploit, or escape \((E^4)\): near-optimal safety-constrained reinforcement learning in polynomial time (English)
27 June 2023
safe artificial intelligence
safe exploration
model-based reinforcement learning
constrained Markov decision processes
robust Markov decision processes