The asymptotic equipartition property in reinforcement learning and its relation to return maximization
Publication:2488678
DOI: 10.1016/j.neunet.2005.02.008; zbMath: 1093.68082; OpenAlex: W2171630474; Wikidata: Q51962962; Scholia: Q51962962; MaRDI QID: Q2488678
Hideaki Sakai, Kazushi Ikeda, Kazunori Iwata
Publication date: 11 May 2006
Published in: Neural Networks
Full work available at URL: https://doi.org/10.1016/j.neunet.2005.02.008
Keywords: Markov decision process, information theory, reinforcement learning, stochastic complexity, asymptotic equipartition property, return maximization
Cites Work
- A Mathematical Theory of Communication
- Asynchronous stochastic approximation and Q-learning
- Convergence results for single-step on-policy reinforcement-learning algorithms
- \({\mathcal Q}\)-learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Universal coding with minimum probability of codeword length overflow
- Algorithmic Information Theory
- Statistical inference under multiterminal rate restrictions: a differential geometric approach
- The error exponent for the noiseless encoding of finite ergodic Markov sources
- Variable-to-fixed length codes provide better large deviations performance than fixed-to-variable length codes
- Algorithmic Information Theory
- Reliability function of a discrete memoryless channel at rates above capacity (Corresp.)
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- The method of types [information theory]