Limiting dynamics for Q-learning with memory one in symmetric two-player, two-action games

Publication:6374049

arXiv: 2107.13995, MaRDI QID: Q6374049

J. M. Meylahn, Lars A. L. Janssen

Publication date: 29 July 2021

Abstract: We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner's dilemma, stag hunt and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
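As a rough illustration of what a memory-one pure strategy looks like, the sketch below encodes win-stay, lose-shift as a lookup table from the previous round's outcome to the next action and iterates two such players against each other. It is not the computer algebra method of the paper. The action labels "C" and "D" and the names `WSLS`, `step` and `trajectory` are assumptions introduced here for illustration, and no payoff values are used.

```python
# Actions: "C" (cooperate) and "D" (defect); labels are illustrative.
# A memory-one pure strategy maps the previous round's outcome,
# seen as (own action, opponent's action), to the next action.
# Win-stay, lose-shift: repeat the action after CC or DC, switch after CD or DD.
WSLS = {
    ("C", "C"): "C",
    ("C", "D"): "D",
    ("D", "C"): "D",
    ("D", "D"): "C",
}


def step(state, strategy_1, strategy_2):
    """Advance one round; state is (player 1's action, player 2's action)."""
    a1, a2 = state
    # Each player reads the outcome from their own perspective.
    return strategy_1[(a1, a2)], strategy_2[(a2, a1)]


def trajectory(start, strategy_1, strategy_2, rounds):
    """Return the list of states visited, starting from `start`."""
    states = [start]
    for _ in range(rounds):
        states.append(step(states[-1], strategy_1, strategy_2))
    return states


# Play WSLS against itself from every possible starting outcome.
paths = {start: trajectory(start, WSLS, WSLS, 3) for start in WSLS}
for start, path in paths.items():
    print(start, "->", path[1:])
```

In this sketch, two win-stay, lose-shift players reach mutual cooperation within two rounds from any starting outcome and then stay there. Whether that pair is an equilibrium depends on the payoffs, which is the question the paper addresses.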











