RUDDER: Return Decomposition for Delayed Rewards (Q6303288)
preprint article from arXiv
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | RUDDER: Return Decomposition for Delayed Rewards | preprint article from arXiv | |
Statements
20 June 2018
cs.LG
cs.AI
math.OC
stat.ML
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter