Learning One Representation to Optimize All Rewards (Q6362810)
preprint article from arXiv
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Learning One Representation to Optimize All Rewards | preprint article from arXiv | |
Statements
14 March 2021
cs.LG
cs.AI
math.OC
Ahmed Touati
Yann Ollivier