Planning and acting in partially observable stochastic domains
Publication: 72343
DOI: 10.1016/S0004-3702(98)00023-X
zbMath: 0908.68165
OpenAlex: W2168359464
Wikidata: Q56602944
Scholia: Q56602944
MaRDI QID: Q72343
Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra
Publication date: May 1998
Published in: Artificial Intelligence
Full work available at URL: https://doi.org/10.1016/s0004-3702(98)00023-x
Related Items (only showing first 100 items)
- The Concept of Opposition and Its Use in Q-Learning and Q(λ) Techniques
- Large-scale financial planning via a partially observable stochastic dual dynamic programming framework
- Learning-based state estimation and control using MHE and MPC schemes with imperfect models
- Reward prediction errors, not sensory prediction errors, play a major role in model selection in human reinforcement learning
- Approximability and efficient algorithms for constrained fixed-horizon POMDPs with durative actions
- A conflict-directed approach to chance-constrained mixed logical linear programming
- Risk-aware shielding of partially observable Monte Carlo planning policies
- Simultaneous perception-action design via invariant finite belief sets
- Bridging Commonsense Reasoning and Probabilistic Planning via a Probabilistic Action Language
- Reward Maximization Through Discrete Active Inference
- POMDP controllers with optimal budget
- Epistemic uncertainty aware semantic localization and mapping for inference and belief space planning
- A Markovian model for the spread of the SARS-CoV-2 virus
- Behavioral model summarisation for other agents under uncertainty
- Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
- Using Machine Learning for Decreasing State Uncertainty in Planning
- Minimax real-time heuristic search
- Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
- A reinforcement learning scheme for a partially-observable multi-agent game
- Representation and Timing in Theories of the Dopamine System
- General Value Function Networks
- A Sufficient Statistic for Influence in Structured Multiagent Environments
- Constrained Multiagent Markov Decision Processes: a Taxonomy of Problems and Algorithms
- Induction and Exploitation of Subgoal Automata for Reinforcement Learning
- Strategy Graphs for Influence Diagrams
- Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes
- Analyzing generalized planning under nondeterminism
- An evidential approach to SLAM, path planning, and active exploration
- Meeting a deadline: shortest paths on stochastic directed acyclic graphs with information gathering
- Goal-directed learning of features and forward models
- Enforcing almost-sure reachability in POMDPs
- Finite-horizon LQR controller for partially-observed Boolean dynamical systems
- Myopic Bounds for Optimal Policy of POMDPs: An Extension of Lovejoy's Structural Results
- Optimal speech motor control and token-to-token variability: a Bayesian modeling approach
- Dynamic multiagent probabilistic inference
- Autonomous agents modelling other agents: a comprehensive survey and open problems
- A two-state partially observable Markov decision process with three actions
- Gradient-descent for randomized controllers under partial observability
- A synthesis of automated planning and reinforcement learning for efficient, robust decision-making
- Learning Where to Attend with Deep Architectures for Image Tracking
- Recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies
- Active inference and agency: optimal control without cost functions
- Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning
- Open problems in universal induction & intelligence
- Cost-sensitive feature acquisition and classification
- Planning for multiple measurement channels in a continuous-state POMDP
- Simultaneous learning and planning in a hierarchical control system for a cognitive agent
- Partially observable multistage stochastic programming
- Multi-stage classifier design
- Learning to steer nonlinear interior-point methods
- Representations for robot knowledge in the KnowRob framework
- Robotic manipulation of multiple objects as a POMDP
- Geometric backtracking for combined task and motion planning in robotic systems
- Supervisor synthesis of POMDP via automata learning
- Exact decomposition approaches for Markov decision processes: a survey
- Testing probabilistic equivalence through reinforcement learning
- Bounded-Parameter Partially Observable Markov Decision Processes: Framework and Algorithm
- Planning in partially-observable switching-mode continuous domains
- Optimal management of stochastic invasion in a metapopulation with Allee effects
- Quantitative controller synthesis for consumption Markov decision processes
- Rationalizing predictions by adversarial information calibration
- Probabilistic may/must testing: retaining probabilities by restricted schedulers
- Affect control processes: intelligent affective interaction using a partially observable Markov decision process
- Dynamic optimization over infinite-time horizon: web-building strategy in an orb-weaving spider as a case study
- Multi-goal motion planning using traveling salesman problem in belief space
- Privacy stochastic games in distributed constraint reasoning
- Policy iteration for bounded-parameter POMDPs
- Probabilistic reasoning about epistemic action narratives
- Reasoning and predicting POMDP planning complexity via covering numbers
- Computing rank dependent utility in graphical models for sequential decision problems
- Exploiting symmetries for single- and multi-agent partially observable stochastic domains
- Decentralized MDPs with sparse interactions
- State observation accuracy and finite-memory policy performance
- An online multi-agent co-operative learning algorithm in POMDPs
- Markov limid processes for representing and solving renewal problems
- Computation of weighted sums of rewards for concurrent MDPs
- Strong planning under partial observability
- Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods
- An affective mobile robot educator with a full-time job
- Markov decision processes with sequential sensor measurements
- Counterexample-guided inductive synthesis for probabilistic systems
- Algorithms and conditional lower bounds for planning problems
- A survey of inverse reinforcement learning: challenges, methods and progress
- An integrated approach to solving influence diagrams and finite-horizon partially observable decision processes
- Task-structured probabilistic I/O automata
- Posterior Weighted Reinforcement Learning with State Uncertainty
- Integration of Reinforcement Learning and Optimal Decision-Making Theories of the Basal Ganglia
- The value of information for populations in varying environments
- pomdpSolve
- Conformant plans and beyond: principles and complexity
- Performance prediction of an unmanned airborne vehicle multi-agent system
Uses Software
Cites Work
- Application of Jensen's inequality to adaptive suboptimal design
- A survey of solution techniques for the partially observed Markov decision process
- The complexity of stochastic games
- The complexity of mean payoff games on graphs
- Fast planning through planning graph analysis
- Optimal control of Markov processes with incomplete state information
- A survey of algorithmic methods for partially observed Markov decision processes
- Solving H-horizon, stationary Markov decision problems in time proportional to log (H)
- The Optimal Search for a Moving Target When the Search Path Is Constrained
- State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
- Optimal Control for Partially Observable Markov Decision Processes over an Infinite Horizon
- The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
- Solution Procedures for Partially Observed Markov Decision Processes
This page was built for publication: Planning and acting in partially observable stochastic domains