On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm
Publication:6436223
arXiv: 2305.06660 · MaRDI QID: Q6436223
Author name not available
Publication date: 11 May 2023
Abstract: When fitting the learning data of an individual to algorithm-like learning models, the observations are so dependent and non-stationary that one may wonder what the classical Maximum Likelihood Estimator (MLE) can achieve, even though it is the usual tool in experimental cognition. Our first objective in this work is to show that the learning rate cannot be estimated efficiently if it is constant in the classical Exp3 (Exponential weights for Exploration and Exploitation) algorithm. Second, we show that if the learning rate decreases polynomially with the sample size, then the prediction error and, in some cases, the estimation error of the MLE satisfy bounds in probability that decrease at a polynomial rate.
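To make the estimation problem in the abstract concrete, the following is a minimal Python sketch. The abstract does not specify the exact Exp3 variant, so everything here is an assumption for illustration: a loss-based Exp3 with importance-weighted loss estimates and no uniform mixing, Bernoulli losses, and a grid search for the MLE. The function names (`exp3_probs`, `simulate`, `log_likelihood`, `mle_grid`) are hypothetical and are not taken from the paper or its companion repository.

```python
import math
import random


def exp3_probs(cum_est_loss, eta):
    """Choice probabilities proportional to exp(-eta * cumulative estimated loss)."""
    m = min(cum_est_loss)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (loss - m)) for loss in cum_est_loss]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(eta, loss_means, n, rng):
    """Simulate n rounds of the assumed Exp3 variant with Bernoulli losses."""
    k = len(loss_means)
    cum = [0.0] * k
    actions, losses = [], []
    for _ in range(n):
        p = exp3_probs(cum, eta)
        a = rng.choices(range(k), weights=p)[0]
        loss = 1.0 if rng.random() < loss_means[a] else 0.0
        cum[a] += loss / max(p[a], 1e-12)  # importance-weighted loss estimate
        actions.append(a)
        losses.append(loss)
    return actions, losses


def log_likelihood(eta, actions, losses, k):
    """Log-likelihood of the observed choices for a candidate learning rate."""
    cum = [0.0] * k
    ll = 0.0
    for a, loss in zip(actions, losses):
        p_a = max(exp3_probs(cum, eta)[a], 1e-12)  # floor avoids log(0)
        ll += math.log(p_a)
        cum[a] += loss / p_a
    return ll


def mle_grid(actions, losses, k, grid):
    """Grid-search MLE: the candidate learning rate with the highest log-likelihood."""
    return max(grid, key=lambda eta: log_likelihood(eta, actions, losses, k))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 500
    true_eta = n ** -0.5  # a learning rate decreasing polynomially in n (assumed exponent)
    actions, losses = simulate(true_eta, [0.2, 0.5, 0.8], n, rng)
    grid = [i / 200 for i in range(1, 61)]
    print("true eta:", true_eta, "grid MLE:", mle_grid(actions, losses, 3, grid))
```

Because the choice probabilities at each round depend on the whole history of past choices and losses, the likelihood is not a product of independent, identically distributed terms, which is the dependence and non-stationarity the abstract refers to. The sketch only illustrates how the likelihood is formed; it says nothing about the rates established in the paper.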
Has companion code repository: https://github.com/JulienAubert3/Exp3R