RUDDER: Return Decomposition for Delayed Rewards (Q6303288)

From MaRDI portal

preprint article from arXiv
Language: English
Label: RUDDER: Return Decomposition for Delayed Rewards
Description: preprint article from arXiv

    Statements

    Publication date: 20 June 2018
    arXiv classification: cs.LG, cs.AI, math.OC, stat.ML
    Authors: Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter