Option-Critic in Cooperative Multi-agent Systems

Publication:6330058

arXiv: 1911.12825, MaRDI QID: Q6330058

Author name not available

Publication date: 28 November 2019

Abstract: In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-agent system by introducing a common information approach. We use the notion of common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, which uses centralized option evaluation and decentralized intra-option improvement. We theoretically analyze the asymptotic convergence of DOC and build a new multi-agent environment to demonstrate its validity. Our experiments empirically show that DOC performs competitively against baselines and scales with the number of agents.
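
For readers unfamiliar with the options framework cited in the abstract, the following is a minimal single-agent sketch of an option as the usual triple of initiation set, intra-option policy, and termination condition. It is not the DOC algorithm and is not taken from the companion repository; the names `Option`, `run_option`, `chain_step`, and `go_right`, as well as the toy chain environment, are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass
class Option:
    """An option in the sense of the options framework (illustrative only)."""
    initiation_set: Set[int]                  # states where the option may start
    policy: Callable[[int], int]              # intra-option policy: state -> action
    termination_prob: Callable[[int], float]  # probability of stopping in a state


def run_option(
    option: Option,
    state: int,
    step: Callable[[int, int], int],
    rng: random.Random,
    max_steps: int = 100,
) -> Tuple[int, int]:
    """Follow the option's policy until it terminates or max_steps is reached.

    Returns the final state and the number of primitive steps taken.
    """
    if state not in option.initiation_set:
        raise ValueError("option cannot be initiated in this state")
    steps = 0
    while steps < max_steps:
        action = option.policy(state)
        state = step(state, action)
        steps += 1
        # Sample the termination condition in the newly reached state.
        if rng.random() < option.termination_prob(state):
            break
    return state, steps


def chain_step(state: int, action: int) -> int:
    """Hypothetical toy environment: a chain of states 0..5."""
    return max(0, min(5, state + action))


# An option that walks right and terminates with certainty at state 5.
go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: 1,
    termination_prob=lambda s: 1.0 if s == 5 else 0.0,
)

final_state, n_steps = run_option(go_right, 0, chain_step, random.Random(0))
```

In a multi-agent setting such as the one the abstract describes, each agent would hold its own options; how those are evaluated centrally and improved per agent is specified in the paper, not in this sketch.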




Has companion code repository: https://github.com/Jhelum-Ch/On-policy-Distributed-Option-Critic







