Reinforcement learning from comparisons: three alternatives are enough, two are not
Publication:1688021
DOI: 10.1214/16-AAP1271 ⋮ zbMath: 1379.60081 ⋮ OpenAlex: W2962773167 ⋮ MaRDI QID: Q1688021
Jean-François Laslier, Benoît Laslier
Publication date: 4 January 2018
Published in: The Annals of Applied Probability
Full work available at URL: https://projecteuclid.org/euclid.aoap/1509696037
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20) ⋮ Memory and learning in psychology (91E40) ⋮ Evolutionary games (91A22)
Related Items (3)
An urn model with random multiple drawing and random addition ⋮ Asymptotic behaviour of the one-dimensional “rock-paper-scissors” cyclic cellular automaton ⋮ Multiple drawing multi-colour urns by stochastic approximation