Planning and acting in partially observable stochastic domains
Publication: 72343
DOI: 10.1016/S0004-3702(98)00023-X
zbMath: 0908.68165
OpenAlex: W2168359464
Wikidata: Q56602944
Scholia: Q56602944
MaRDI QID: Q72343
Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra
Publication date: May 1998
Published in: Artificial Intelligence
Full work available at URL: https://doi.org/10.1016/s0004-3702(98)00023-x
Related Items (only showing first 100 items)
- The Concept of Opposition and Its Use in Q-Learning and Q(λ) Techniques
- Large-scale financial planning via a partially observable stochastic dual dynamic programming framework
- Learning-based state estimation and control using MHE and MPC schemes with imperfect models
- Reward prediction errors, not sensory prediction errors, play a major role in model selection in human reinforcement learning
- Approximability and efficient algorithms for constrained fixed-horizon POMDPs with durative actions
- A conflict-directed approach to chance-constrained mixed logical linear programming
- Risk-aware shielding of partially observable Monte Carlo planning policies
- Simultaneous perception-action design via invariant finite belief sets
- Bridging Commonsense Reasoning and Probabilistic Planning via a Probabilistic Action Language
- Reward Maximization Through Discrete Active Inference
- POMDP controllers with optimal budget
- Epistemic uncertainty aware semantic localization and mapping for inference and belief space planning
- A Markovian model for the spread of the SARS-CoV-2 virus
- Behavioral model summarisation for other agents under uncertainty
- Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
- Using Machine Learning for Decreasing State Uncertainty in Planning
- Minimax real-time heuristic search
- Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
- A reinforcement learning scheme for a partially-observable multi-agent game
- Representation and Timing in Theories of the Dopamine System
- General Value Function Networks
- A Sufficient Statistic for Influence in Structured Multiagent Environments
- Constrained Multiagent Markov Decision Processes: a Taxonomy of Problems and Algorithms
- Induction and Exploitation of Subgoal Automata for Reinforcement Learning
- Strategy Graphs for Influence Diagrams
- Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes
- Analyzing generalized planning under nondeterminism
- An evidential approach to SLAM, path planning, and active exploration
- Meeting a deadline: shortest paths on stochastic directed acyclic graphs with information gathering
- Goal-directed learning of features and forward models
- Enforcing almost-sure reachability in POMDPs
- Finite-horizon LQR controller for partially-observed Boolean dynamical systems
- Myopic Bounds for Optimal Policy of POMDPs: An Extension of Lovejoy's Structural Results
- Optimal speech motor control and token-to-token variability: a Bayesian modeling approach
- Dynamic multiagent probabilistic inference
- Autonomous agents modelling other agents: a comprehensive survey and open problems
- A two-state partially observable Markov decision process with three actions
- Gradient-descent for randomized controllers under partial observability
- A synthesis of automated planning and reinforcement learning for efficient, robust decision-making
- Learning Where to Attend with Deep Architectures for Image Tracking
- Recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies
- Active inference and agency: optimal control without cost functions
- Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning
- Open problems in universal induction & intelligence
- Cost-sensitive feature acquisition and classification
- Planning for multiple measurement channels in a continuous-state POMDP
- Simultaneous learning and planning in a hierarchical control system for a cognitive agent
- Partially observable multistage stochastic programming
- Multi-stage classifier design
- Learning to steer nonlinear interior-point methods
- Representations for robot knowledge in the KnowRob framework
- Robotic manipulation of multiple objects as a POMDP
- Geometric backtracking for combined task and motion planning in robotic systems
- Supervisor synthesis of POMDP via automata learning
- Exact decomposition approaches for Markov decision processes: a survey
- Testing probabilistic equivalence through reinforcement learning
- Bounded-Parameter Partially Observable Markov Decision Processes: Framework and Algorithm
- Planning in partially-observable switching-mode continuous domains
- Optimal management of stochastic invasion in a metapopulation with Allee effects
- Quantitative controller synthesis for consumption Markov decision processes
- Rationalizing predictions by adversarial information calibration
- Probabilistic may/must testing: retaining probabilities by restricted schedulers
- Affect control processes: intelligent affective interaction using a partially observable Markov decision process
- Dynamic optimization over infinite-time horizon: web-building strategy in an orb-weaving spider as a case study
- Multi-goal motion planning using traveling salesman problem in belief space
- Privacy stochastic games in distributed constraint reasoning
- Policy iteration for bounded-parameter POMDPs
- Probabilistic reasoning about epistemic action narratives
- Reasoning and predicting POMDP planning complexity via covering numbers
- Computing rank dependent utility in graphical models for sequential decision problems
- Exploiting symmetries for single- and multi-agent partially observable stochastic domains
- Decentralized MDPs with sparse interactions
- State observation accuracy and finite-memory policy performance
- An online multi-agent co-operative learning algorithm in POMDPs
- Markov limid processes for representing and solving renewal problems
- Computation of weighted sums of rewards for concurrent MDPs
- Strong planning under partial observability
- Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods
- An affective mobile robot educator with a full-time job
- Markov decision processes with sequential sensor measurements
- Counterexample-guided inductive synthesis for probabilistic systems
- Algorithms and conditional lower bounds for planning problems
- A survey of inverse reinforcement learning: challenges, methods and progress
- An integrated approach to solving influence diagrams and finite-horizon partially observable decision processes
- Task-structured probabilistic I/O automata
- Posterior Weighted Reinforcement Learning with State Uncertainty
- Integration of Reinforcement Learning and Optimal Decision-Making Theories of the Basal Ganglia
- The value of information for populations in varying environments
- pomdpSolve
- Conformant plans and beyond: principles and complexity
- Performance prediction of an unmanned airborne vehicle multi-agent system
Uses Software
Cites Work
- Application of Jensen's inequality to adaptive suboptimal design
- A survey of solution techniques for the partially observed Markov decision process
- The complexity of stochastic games
- The complexity of mean payoff games on graphs
- Fast planning through planning graph analysis
- Optimal control of Markov processes with incomplete state information
- A survey of algorithmic methods for partially observed Markov decision processes
- Solving H-horizon, stationary Markov decision problems in time proportional to log (H)
- The Optimal Search for a Moving Target When the Search Path Is Constrained
- State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
- Optimal Control for Partially Observable Markov Decision Processes over an Infinite Horizon
- The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
- Solution Procedures for Partially Observed Markov Decision Processes
This page was built for publication: Planning and acting in partially observable stochastic domains