A Provably Convergent Information Bottleneck Solution via ADMM
Publication:6360194
arXiv: 2102.04729
MaRDI QID: Q6360194
Author name not available
Publication date: 9 February 2021
Abstract: The information bottleneck (IB) method enables optimizing over the trade-off between compression of data and prediction accuracy of learned representations, and has successfully and robustly been applied to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize. The IB Lagrangian is non-convex and existing solutions guarantee only local convergence. As a result, the obtained solutions depend on initialization. Second, the evaluation of a solution is also a challenging task. Conventionally, it resorts to characterizing the information plane, that is, plotting I(Y;Z) versus I(X;Z) for all solutions obtained from different initial points. Furthermore, the IB Lagrangian has phase transitions while varying the multiplier β. At phase transitions, both I(X;Z) and I(Y;Z) increase abruptly and the rate of convergence becomes significantly slow for existing solutions. Recent works with IB adopt variational surrogate bounds to the IB Lagrangian. Although these surrogates allow efficient optimization, it is not clear how close they are to the IB Lagrangian. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Different from prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of β. Empirically, our gradient-descent-based method results in information plane points that are comparable to those obtained through the conventional Blahut-Arimoto-based solvers and is convergent for a wider range of the penalty coefficient than previous ADMM solvers.
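To make the quantities in the abstract concrete, the following is a minimal sketch of how one information-plane point and the IB Lagrangian could be evaluated for small discrete distributions. It is not taken from the paper or its companion repository. The function names are hypothetical, and the form I(X;Z) − β·I(Y;Z) is the standard IB Lagrangian, assumed here; the paper may use an equivalent but differently scaled parameterization.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in nats for a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi


def information_plane_point(p_xy, p_z_given_x):
    """Return (I(X;Z), I(Y;Z)) for an encoder p_z_given_x[x][z].

    Assumes the Markov chain Y - X - Z, so that
    p(y, z) = sum_x p(x, y) * p(z | x).
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(p_z_given_x[0])
    p_x = [sum(row) for row in p_xy]
    p_xz = [[p_x[x] * p_z_given_x[x][z] for z in range(n_z)]
            for x in range(n_x)]
    p_yz = [[sum(p_xy[x][y] * p_z_given_x[x][z] for x in range(n_x))
             for z in range(n_z)]
            for y in range(n_y)]
    return mutual_information(p_xz), mutual_information(p_yz)


def ib_lagrangian(p_xy, p_z_given_x, beta):
    """Standard IB Lagrangian I(X;Z) - beta * I(Y;Z) (assumed form)."""
    i_xz, i_yz = information_plane_point(p_xy, p_z_given_x)
    return i_xz - beta * i_yz
```

Evaluating `information_plane_point` for encoders obtained from different initial points, and repeating this over a range of `beta` values, would give the scatter of (I(X;Z), I(Y;Z)) pairs that the abstract calls the information plane.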
Has companion code repository: https://github.com/hui811116/ib-admm