scientific article; zbMATH DE number 5037124
From MaRDI portal
Publication:5477863
DOI: 10.1023/A:1018064306595; zbMath: 1099.68692; MaRDI QID: Q5477863
Publication date: 29 June 2006
Published in: Machine Learning
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Learning and adaptive systems in artificial intelligence (68T05); Stochastic learning and adaptive control (93E35); Markov and semi-Markov decision processes (90C40)
Related Items
Hybrid MDP based integrated hierarchical Q-learning ⋮ Job control in heterogeneous computing systems ⋮ Model-based average reward reinforcement learning ⋮ Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems ⋮ \(R(\lambda)\) imitation learning for automatic generation control of interconnected power grids ⋮ Multi-agent natural actor-critic reinforcement learning algorithms ⋮ Optimal Curiosity-Driven Modular Incremental Slow Feature Analysis ⋮ Reinforcement learning for long-run average cost. ⋮ SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING ⋮ Analyzing anonymity attacks through noisy channels ⋮ Unnamed Item ⋮ Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning ⋮ Long-Term Reward Prediction in TD Models of the Dopamine System ⋮ A construction algorithm for designing guide paths of automated guided vehicle systems ⋮ Importance sampling in reinforcement learning with an estimated behavior policy ⋮ A Neurocomputational Model for Cocaine Addiction ⋮ Off-Policy Estimation of Long-Term Average Outcomes With Applications to Mobile Health ⋮ Batch policy learning in average reward Markov decision processes