scientific article; zbMATH DE number 7164728
Publication:5214220
zbMATH: 1434.68394; arXiv: 1703.07940; MaRDI QID: Q5214220
Publication date: 7 February 2020
Full work available at URL: https://arxiv.org/abs/1703.07940
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Learning and adaptive systems in artificial intelligence (68T05); Dynamic programming (90C39); Sequential statistical analysis (62L10)
Cites Work
- Unnamed Item (7 entries)
- Planning and acting in partially observable stochastic domains
- Extreme state aggregation beyond Markov decision processes
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes
- Approximate dynamic programming with a fuzzy parameterization
- Variable resolution discretization in optimal control
- \(\mathcal{Q}\)-learning
- Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains
- Basis function adaptation in temporal difference reinforcement learning
- Approximate Dynamic Programming
- Markov Chains and Stochastic Stability
- A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
- Inequalities: theory of majorization and its applications