Immediate return preference emerged from a synaptic learning rule for return maximization
From MaRDI portal
Publication:889365
DOI: 10.1016/j.neunet.2014.04.004, zbMath: 1368.92038, OpenAlex: W2043111235, Wikidata: Q46890777, Scholia: Q46890777, MaRDI QID: Q889365
Yoshiya Yamaguchi, Yutaka Sakai, Takeshi Aihara
Publication date: 6 November 2015
Published in: Neural Networks
Full work available at URL: https://doi.org/10.1016/j.neunet.2014.04.004
Cites Work
- A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Internal-Time Temporal Difference Model for Neural Value-Based Decision Making
- On Actor-Critic Algorithms
- Representation and Timing in Theories of the Dopamine System