Gradient temporal-difference learning for off-policy evaluation using emphatic weightings (Q6146179)
From MaRDI portal
scientific article; zbMATH DE number 7786202
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Gradient temporal-difference learning for off-policy evaluation using emphatic weightings | scientific article; zbMATH DE number 7786202 | |
Statements
Gradient temporal-difference learning for off-policy evaluation using emphatic weightings (English)
10 January 2024
reinforcement learning
off-policy evaluation
temporal-difference learning
gradient temporal-difference learning
emphatic approach