The following pages link to Recurrent policy gradients (Q3588966):
Displaying 9 items.
- The factored policy-gradient planner (Q835832)
- Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains (Q1699932)
- Real-time reinforcement learning by sequential actor-critics and experience replay (Q1784532)
- Multikernel recursive least-squares temporal difference learning (Q1990335)
- Machine learning for combinatorial optimization: a methodological tour d'horizon (Q2029358)
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning (Q2887009)
- (Q5054599)
- Toward Training Recurrent Neural Networks for Lifelong Learning (Q5131161)
- Finite-time analysis of natural actor-critic for POMDPs (Q6633040)