Pages that link to "Item:Q3116659"
From MaRDI portal
The following pages link to Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning (Q3116659):
Displaying 23 items.
- On undiscounted semi-Markov decision processes with absorbing states (Q283987)
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system (Q320866)
- The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems (Q421584)
- Approximate dynamic programming for capacity allocation in the service industry (Q439484)
- Semi-Markov and reward fields (Q464467)
- A policy gradient method for semi-Markov decision processes with application to call admission control (Q859693)
- Model-based average reward reinforcement learning (Q1128769)
- Reinforcement learning for long-run average cost. (Q1427588)
- Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning (Q1762118)
- A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis (Q1771225)
- Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems (Q1926824)
- A performance-centred approach to optimising maintenance of complex systems (Q2030609)
- A sojourn-based approach to semi-Markov reinforcement learning (Q2149523)
- Application of reinforcement learning to the game of Othello (Q2462546)
- A heuristically accelerated reinforcement learning method for maintenance policy of an assembly line (Q2691260)
- An intelligent choice of witnesses in the Miller-Rabin primality test. Reinforcement learning approach (Q2700038)
- A Neurocomputational Model for Cocaine Addiction (Q3399365)
- Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration (Q3654586)
- (Q4003930)
- The explicit form of the rate function for semi-Markov processes and its contractions (Q4642706)
- A simulation-based approach to study stochastic inventory-planning games (Q4668256)
- Representation and Timing in Theories of the Dopamine System (Q5476688)
- Logarithmic regret bounds for continuous-time average-reward Markov decision processes (Q6608781)