The following pages link to (Q4969098):
Displaying 11 items.
- The factored policy-gradient planner (Q835832)
- Estimation and approximation bounds for gradient-based reinforcement learning (Q1604222)
- Importance sampling in reinforcement learning with an estimated behavior policy (Q2051319)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- Bayesian policy gradient and actor-critic algorithms (Q2810874)
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning (Q2887009)
- (Q4533363)
- Rejoinder: New Objectives for Policy Learning (Q4999146)
- (Q5148932)
- Smoothing policies and safe policy gradients (Q6097096)
- DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning -- extended version (Q6599368)