Simpler, Faster, Stronger: Breaking The log-K Curse On Contrastive Learners With FlatNCE
arXiv: 2107.01152
MaRDI QID: Q6371843
Author name not available
Publication date: 2 July 2021
Abstract: InfoNCE-based contrastive representation learners, such as SimCLR, have been tremendously successful in recent years. However, these contrastive schemes are notoriously resource demanding, as their effectiveness breaks down with small-batch training (i.e., the log-K curse, where K is the batch size). In this work, we reveal mathematically why contrastive learners fail in the small-batch-size regime, and present a simple yet non-trivial contrastive objective named FlatNCE that fixes this issue. Unlike InfoNCE, FlatNCE no longer explicitly appeals to a discriminative classification goal for contrastive learning. Theoretically, we show that FlatNCE is the mathematical dual formulation of InfoNCE, thereby bridging to the classical literature on energy-based modeling; empirically, we demonstrate that, with minimal modification of code, FlatNCE yields an immediate performance boost independent of subject-matter engineering efforts. The significance of this work is further underscored by the broad applicability of contrastive learning techniques and by the introduction of new tools to monitor and diagnose contrastive training. We substantiate our claims with empirical evidence on CIFAR10, ImageNet, and other datasets, where FlatNCE consistently outperforms InfoNCE.
Has companion code repository: https://github.com/Junya-Chen/FlatCLR
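The abstract notes that FlatNCE requires only a minimal code change relative to InfoNCE. Below is a minimal PyTorch sketch of that change, in the spirit of the FlatNCE formulation and the companion FlatCLR repository; the function name, the temperature default, and the two-view setup are illustrative assumptions and not the exact repository code. The core trick is that the loss forward value is a constant, but its gradient is the self-normalized FlatNCE gradient, which does not degrade when the number of negatives (batch size) is small.

```python
# Minimal sketch of the FlatNCE objective (illustrative, not the official code).
# Assumes a similarity matrix where entry (i, j) scores anchor i against
# candidate j, and the positive for anchor i sits on the diagonal.
import torch

def flatnce_loss(sim: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """sim: (N, N) similarity matrix; diagonal entries are the positives."""
    n = sim.size(0)
    pos = sim.diag().unsqueeze(1)                       # (N, 1) positive logits
    mask = ~torch.eye(n, dtype=torch.bool, device=sim.device)
    neg = sim[mask].view(n, n - 1)                      # (N, N-1) negative logits
    # Log-sum-exp of (negative - positive) logit differences per anchor.
    v = torch.logsumexp((neg - pos) / temperature, dim=1, keepdim=True)
    # Divide by a detached copy of itself (in log space: exp(v - sg[v])).
    # The forward value is exactly 1, but the backward pass propagates the
    # self-normalized FlatNCE gradient instead of the InfoNCE gradient.
    return torch.exp(v - v.detach()).mean()

# Illustrative usage with two augmented views x, y of the same batch:
x = torch.nn.functional.normalize(torch.randn(8, 128, requires_grad=True), dim=1)
y = torch.nn.functional.normalize(torch.randn(8, 128), dim=1)
flatnce_loss(x @ y.t()).backward()  # gradients flow despite the constant loss value
```

Because the loss value is constant by construction, it cannot be used to monitor training progress directly; the detached log-sum-exp term `v` (or a standard InfoNCE value computed without gradients) is the natural quantity to log instead.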