
Partially Observed Markov Decision Process Multiarmed Bandits—Structural Results

From MaRDI portal
Publication:3169035

DOI: 10.1287/moor.1080.0371, zbMath: 1231.90373, OpenAlex: W1998039896, MaRDI QID: Q3169035

Vikram Krishnamurthy, B. Wahlberg

Publication date: 27 April 2011

Published in: Mathematics of Operations Research

Full work available at URL: https://doi.org/10.1287/moor.1080.0371


zbMATH Keywords

likelihood ratio ordering, stochastic approximation algorithm, opportunistic scheduling, monotone policies, partially observed Markov decision process, multiarmed bandits


Mathematics Subject Classification ID

Deterministic scheduling theory in operations research (90B35); Approximation methods and heuristics in mathematical programming (90C59); Markov and semi-Markov decision processes (90C40)


Related Items (3)

Ambiguous partially observable Markov decision processes: structural results and applications ⋮ Optimal Threshold Policies for Multivariate Stopping-Time POMDPs ⋮ Game of Thrones: Fully Distributed Learning for Multiplayer Bandits


Uses Software

  • POMDPS





