Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
Publication:2514758
DOI: 10.1007/s10994-014-5458-8, zbMath: 1319.68170, OpenAlex: W2032950725, Wikidata: Q115146321, Scholia: Q115146321, MaRDI QID: Q2514758
Weiwei Cheng, Eyke Hüllermeier, Paul Weng, Róbert Busa-Fekete, Balázs Szörényi
Publication date: 3 February 2015
Published in: Machine Learning
Full work available at URL: https://doi.org/10.1007/s10994-014-5458-8
Learning and adaptive systems in artificial intelligence (68T05)
Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)
Cites Work
- The \(K\)-armed dueling bandits problem
- Uniform quasi-concavity in probabilistic constrained stochastic programming
- Nontransitive measurable utility
- A Bernstein-type inequality for \(U\)-statistics and \(U\)-processes
- Tournament solutions and majority voting
- Evolution strategies. A comprehensive introduction
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
- Approximation Theorems of Mathematical Statistics
- Tuning Bandit Algorithms in Stochastic Environments
- Preference Learning
- Algorithms for Reinforcement Learning
- Neuroevolution strategies for episodic reinforcement learning
- Probability Inequalities for Sums of Bounded Random Variables
- Evolutionary Algorithms for Solving Multi-Objective Problems
- Note on Wilcoxon's Two-Sample Test when Ties are Present
- Finite-time analysis of the multiarmed bandit problem