A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game

Publication:1048261

DOI: 10.1007/s11538-009-9424-8
zbMath: 1182.91048
OpenAlex: W2073008835
Wikidata: Q39975754
Scholia: Q39975754
MaRDI QID: Q1048261

Naoki Masuda, Hisashi Ohtsuki

Publication date: 11 January 2010

Published in: Bulletin of Mathematical Biology

Full work available at URL: https://doi.org/10.1007/s11538-009-9424-8

zbMATH Keywords

reinforcement learning; cooperation; prisoner's dilemma; direct reciprocity

Mathematics Subject Classification ID

Cooperative games (91A12); Models of societies, social and urban evolution (91D10); Memory and learning in psychology (91E40); Rationality and learning in game theory (91A26)

Related Items (4)

Immediate return preference emerged from a synaptic learning rule for return maximization
Global migration can lead to stronger spatial selection than local migration
Numerical analysis of a reinforcement learning model with the dynamic aspiration level in the iterated prisoner's dilemma
Evolution of cooperation facilitated by reinforcement learning with adaptive aspiration levels
