Pages that link to "Item:Q2504518"

What links here

The following pages link to An actor-critic algorithm for constrained Markov decision processes (Q2504518):

Displaying 25 items.

A constrained optimization perspective on actor-critic algorithms and application to network routing (Q286519) (← links)
An online actor-critic algorithm with function approximation for constrained Markov decision processes (Q438776) (← links)
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization (Q523576) (← links)
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes (Q616967) (← links)
A new learning algorithm for optimal stopping (Q839001) (← links)
Actor-critic algorithms for hierarchical Markov decision processes (Q856510) (← links)
Random search for constrained Markov decision processes with multi-policy improvement (Q895275) (← links)
Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603) (← links)
Approachability in Stackelberg stochastic games with vector costs (Q1707454) (← links)
Delay-aware online service scheduling in high-speed railway communication systems (Q1717936) (← links)
Whittle index based Q-learning for restless bandits with average reward (Q2116660) (← links)
Learning algorithms for finite horizon constrained Markov decision processes (Q2468856) (← links)
A note on linear function approximation using random projections (Q2519761) (← links)
A convergent online single time scale actor critic algorithm (Q2896031) (← links)
Actor-critic algorithm based on symmetric perturbation sampling (基于对称扰动采样的Actor-critic 算法) (Q2992408) (← links)
A least squares temporal difference actor–critic algorithm with applications to warehouse management (Q3120552) (← links)
Natural Actor-Critic based on batch recursive least-squares (Q3461512) (← links)
Opportunistic Transmission over Randomly Varying Channels (Q3616977) (← links)
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria (Q4558492) (← links)
Artificial Intelligence and Soft Computing - ICAISC 2004 (Q4666259) (← links)
Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation (Q5009779) (← links)
Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286) (← links)
Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation (Q5198538) (← links)
An Actor-Critic Algorithm With Second-Order Actor and Critic (Q5352622) (← links)
Safety-constrained reinforcement learning with a distributional safety critic (Q6106435) (← links)