scientific article; zbMATH DE number 6860770
Publication:4636970
zbMath 1434.68446; MaRDI QID Q4636970
Publication date: 17 April 2018
Full work available at URL: http://jmlr.csail.mit.edu/papers/v18/15-251.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
reductions; imitation learning; Monte-Carlo tree search; online sequential decision-making; partial policy; partial policy learning
- Decision theory (91B06)
- Learning and adaptive systems in artificial intelligence (68T05)
- Markov and semi-Markov decision processes (90C40)
- Online algorithms; streaming algorithms (68W27)
Related Items (2)
- A synthesis of automated planning and reinforcement learning for efficient, robust decision-making
- Unnamed Item
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Depth-first iterative-deepening: An optimal admissible tree search
- Landmark learning: An illustration of associative search
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- A Concise Introduction to Models and Methods for Automated Planning
- A Monte-Carlo AIXI Approximation
- Progressive strategies for Monte-Carlo tree search
- Finite-time analysis of the multiarmed bandit problem