Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework
Publication: 6427456
arXiv: 2302.12247
MaRDI QID: Q6427456
Author name not available
Publication date: 23 February 2023
Abstract: The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different signals. Despite these empirical advances, there remain fundamental research questions: how can we quantify the nature of interactions that exist among input features? Subsequently, how can we capture these interactions using suitable data-driven methods? To answer these questions, we propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy across input features, which we term the PID statistics of a multimodal distribution. Using two newly proposed estimators that scale to high-dimensional distributions, we demonstrate their usefulness in quantifying the interactions within multimodal datasets, the nature of interactions captured by multimodal models, and principled approaches for model selection. We conduct extensive experiments on both synthetic datasets where the PID statistics are known and on large-scale multimodal benchmarks where PID estimation was previously impossible. Finally, to demonstrate the real-world applicability of our approach, we present three case studies in pathology, mood prediction, and robotic perception, where our framework accurately recommends strong multimodal models for each application.
Has companion code repository: https://github.com/pliang279/pid
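For intuition only, the sketch below decomposes the joint mutual information I(X1, X2; Y) of a small discrete distribution into redundancy, uniqueness, and synergy. It uses the simple minimum-mutual-information redundancy as an assumed proxy, not the estimators proposed in the paper (those are in the companion repository above); the helper names pid_mmi and mutual_information are hypothetical and written only for this illustration.

```python
# Minimal sketch, assuming a small discrete joint distribution over (x1, x2, y).
# Redundancy here is the min-mutual-information proxy, chosen only for
# illustration; it is NOT the paper's PID estimator.
from collections import defaultdict
from math import log2

def mutual_information(joint, vars_a, vars_b):
    """I(A; B) in bits, where joint maps (x1, x2, y) tuples to probabilities."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in vars_a)
        b = tuple(outcome[i] for i in vars_b)
        pa[a] += p
        pb[b] += p
        pab[(a, b)] += p
    return sum(p * log2(p / (pa[a] * pb[b])) for (a, b), p in pab.items() if p > 0)

def pid_mmi(joint):
    """Redundancy/uniqueness/synergy using the min-MI redundancy proxy."""
    i1 = mutual_information(joint, (0,), (2,))      # I(X1; Y)
    i2 = mutual_information(joint, (1,), (2,))      # I(X2; Y)
    i12 = mutual_information(joint, (0, 1), (2,))   # I(X1, X2; Y)
    r = min(i1, i2)
    u1, u2 = i1 - r, i2 - r
    s = i12 - r - u1 - u2                            # remainder is synergy
    return {"redundancy": r, "unique_1": u1, "unique_2": u2, "synergy": s}

# XOR: neither input alone predicts Y, so all information is synergistic.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
print(pid_mmi(xor))   # synergy = 1 bit, everything else = 0

# Redundant copy: both inputs carry the same bit about Y.
copy = {(a, a, a): 0.5 for a in (0, 1)}
print(pid_mmi(copy))  # redundancy = 1 bit, no uniqueness or synergy
```

The two toy distributions show the extremes the PID statistics are meant to separate: XOR yields pure synergy (the output is predictable only from both inputs together), while the redundant copy yields pure redundancy (either input alone suffices).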