Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function
Publication:1886590
DOI: 10.1016/J.NEUNET.2004.05.004, zbMath: 1067.68591, OpenAlex: W2009424996, Wikidata: Q40489238, Scholia: Q40489238, MaRDI QID: Q1886590
Yutaka Sakaguchi, Mitsuo Takano
Publication date: 18 November 2004
Published in: Neural Networks
Full work available at URL: https://doi.org/10.1016/j.neunet.2004.05.004
Keywords: Reliability; Meta-learning; TD learning; Discount rate; Exploration-exploitation balance; Internal prediction; Model-free reinforcement learning; Temperature parameter
Cites Work
- Dual-control theory. I
- The apparent conflict between estimation and control - a survey of the two-armed bandit problem
- A near-optimal polynomial time algorithm for learning in certain classes of stochastic games
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- \({\mathcal Q}\)-learning
- Mean, variance and probabilistic criteria in finite Markov decision processes: A review
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function