Pages that link to "Item:Q1345139"
From MaRDI portal
The following pages link to Asynchronous stochastic approximation and Q-learning (Q1345139):
- Multiscale Q-learning with linear function approximation (Q312650)
- Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design (Q313259)
- Perspectives of approximate dynamic programming (Q333093)
- Q-learning and policy iteration algorithms for stochastic shortest path problems (Q378731)
- Iterative learning control for large scale nonlinear systems with observation noise (Q445110)
- Stabilization of stochastic approximation by step size adaptation (Q450652)
- Adaptive dynamic programming and optimal control of nonlinear nonaffine systems (Q472591)
- Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach (Q511735)
- The Borkar-Meyn theorem for asynchronous stochastic approximations (Q553371)
- Actor-critic algorithms for hierarchical Markov decision processes (Q856510)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- Q-learning algorithms with random truncation bounds and applications to effective parallel computing (Q946195)
- Natural actor-critic algorithms (Q1049136)
- Stochastic approximation with two time scales (Q1391875)
- Reinforcement learning for long-run average cost (Q1427588)
- An adaptive learning model with foregone payoff information (Q1674985)
- Generalization of a result of Fabian on the asymptotic normality of stochastic approximation (Q1716693)
- A unified framework for stochastic optimization (Q1719609)
- Linear least-squares algorithms for temporal difference learning (Q1911340)
- Feature-based methods for large scale dynamic programming (Q1911341)
- Reinforcement learning with replacing eligibility traces (Q1911343)
- The loss from imperfect value functions in expectation-based and minimax-based tasks (Q1911345)
- Error bounds for constant step-size \(Q\)-learning (Q1932736)
- Approximate stochastic annealing for online control of infinite horizon Markov decision processes (Q1937498)
- Fully asynchronous policy evaluation in distributed reinforcement learning over networks (Q2063869)
- Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation (Q2070010)
- Reinforcement learning and stochastic optimisation (Q2072112)
- An application of approximate dynamic programming in multi-period multi-product advertising budgeting (Q2083404)
- Inhomogeneous deep Q-network for time sensitive applications (Q2093364)
- Fundamental design principles for reinforcement learning algorithms (Q2094028)
- Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning (Q2097782)
- Neural circuits for learning context-dependent associations of stimuli (Q2182880)
- Structural estimation of real options models (Q2271671)
- Convergence results on stochastic adaptive learning (Q2305048)
- An information-theoretic analysis of return maximization in reinforcement learning (Q2375396)
- Online calibrated forecasts: memory efficiency versus universality for learning in games (Q2384142)
- New algorithms of the Q-learning type (Q2440701)
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization (Q2488678)
- An optimal control approach to mode generation in hybrid systems (Q2499632)
- Boundedness of iterates in \(Q\)-learning (Q2504669)
- An asynchronous stochastic approximation theorem and some applications (Q2707252)
- An approximate dynamic programming algorithm for monotone value functions (Q2797467)
- A simulation-based approach to stochastic dynamic programming (Q2863720)
- Exploiting the structural properties of the underlying Markov decision problem in the Q-learning algorithm (Q2901012)
- On Generalized Bellman Equations and Temporal-Difference Learning (Q3305109)
- Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming (Q3458751)
- Learning PDFA with Asynchronous Transitions (Q3588386)
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms (Q4323346)
- Partially asynchronous co-state prediction algorithm (Q4841825)
- Bayesian Exploration for Approximate Dynamic Programming (Q4971589)