Exploratory HJB Equations and Their Convergence
From MaRDI portal
Publication:5047935
DOI: 10.1137/21M1448185, zbMath: 1501.35132, arXiv: 2109.10269, OpenAlex: W4309029392, MaRDI QID: Q5047935
No author found.
Publication date: 17 November 2022
Published in: SIAM Journal on Control and Optimization
Full work available at URL: https://arxiv.org/abs/2109.10269
Keywords: stochastic control, simulated annealing, reinforcement learning, overdamped Langevin equation, entropy regularization, exploratory control, Hamilton-Jacobi-Bellman (HJB) equations
Optimal stochastic control (93E20); Stochastic stability in control theory (93E15); Diffusion processes (60J60); Hamilton-Jacobi equations (35F21)
Related Items (2)
Choquet Regularization for Continuous-Time Reinforcement Learning ⋮ Exploratory Control with Tsallis Entropy for Latent Factor Models
Cites Work
- Comparison principle for unbounded viscosity solutions of degenerate elliptic PDEs with gradient superlinear terms
- Viscosity solutions of fully nonlinear second-order elliptic partial differential equations
- \(W^{1,p}\)-interior estimates for solutions of fully nonlinear, uniformly elliptic equations
- The Poisson equation and estimates for distances between stationary distributions of diffusions
- Exploratory LQG mean field games with entropy regularization
- Exponential ergodicity and convergence for generalized reflected Brownian motion
- Viscosity solutions of general viscous Hamilton-Jacobi equations
- The Kantorovich and variation distances between invariant measures of diffusions and nonlinear stationary Fokker-Planck-Kolmogorov equations
- Controlled Markov processes and viscosity solutions
- Stability of Markovian processes II: continuous-time processes and sampled chains
- Stability of Markovian processes III: Foster–Lyapunov criteria for continuous-time processes
- Regularity and Stability of Feedback Relaxed Controls
- Fokker–Planck–Kolmogorov Equations
- On the regularity theory of fully nonlinear parabolic equations: I
- On the regularity theory of fully nonlinear parabolic equations: II
- User’s guide to viscosity solutions of second order partial differential equations
- Nonuniqueness for Second-Order Elliptic Equations with Measurable Coefficients
- \(L^p\)-theory for fully nonlinear uniformly parabolic equations
- Distances between Stationary Distributions of Diffusions and Solvability of Nonlinear Fokker–Planck–Kolmogorov Equations
- State-Dependent Temperature Control for Langevin Diffusions
- Bounds for the fundamental solution of a parabolic equation
- The Alexandrov-Bakelman-Pucci Weak Maximum Principle for Fully Nonlinear Equations in Unbounded Domains
- Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
- Entropy Regularization for Mean Field Games with Learning