The following pages link to (Q4637066):
Displaying 5 items.
- Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm (Q2514758)
- A Survey of Preference-Based Online Learning with Bandit Algorithms (Q2938721)
- (Q5148957)
- Reward (Mis)design for autonomous driving (Q6098840)
- Preference learning and multiple criteria decision aiding: differences, commonalities, and synergies. II (Q6614639)