Pages that link to "Item:Q3116659"

The following pages link to Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning (Q3116659):

Displaying 23 items.

On undiscounted semi-Markov decision processes with absorbing states (Q283987) (← links)
New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system (Q320866) (← links)
The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems (Q421584) (← links)
Approximate dynamic programming for capacity allocation in the service industry (Q439484) (← links)
Semi-Markov and reward fields (Q464467) (← links)
A policy gradient method for semi-Markov decision processes with application to call admission control (Q859693) (← links)
Model-based average reward reinforcement learning (Q1128769) (← links)
Reinforcement learning for long-run average cost. (Q1427588) (← links)
Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning (Q1762118) (← links)
A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis (Q1771225) (← links)
Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems (Q1926824) (← links)
A performance-centred approach to optimising maintenance of complex systems (Q2030609) (← links)
A sojourn-based approach to semi-Markov reinforcement learning (Q2149523) (← links)
Application of reinforcement learning to the game of Othello (Q2462546) (← links)
A heuristically accelerated reinforcement learning method for maintenance policy of an assembly line (Q2691260) (← links)
An intelligent choice of witnesses in the Miller-Rabin primality test. Reinforcement learning approach (Q2700038) (← links)
A Neurocomputational Model for Cocaine Addiction (Q3399365) (← links)
Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration (Q3654586) (← links)
(Q4003930) (← links)
The explicit form of the rate function for semi-Markov processes and its contractions (Q4642706) (← links)
A simulation-based approach to study stochastic inventory-planning games (Q4668256) (← links)
Representation and Timing in Theories of the Dopamine System (Q5476688) (← links)
Logarithmic regret bounds for continuous-time average-reward Markov decision processes (Q6608781) (← links)