Spectral Regularized Kernel Two-Sample Tests
Publication:6420913
arXiv: 2212.09201
MaRDI QID: Q6420913
Author name not available
Publication date: 18 December 2022
Abstract: Over the last decade, an approach that has gained a lot of popularity to tackle non-parametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show that the popular MMD (maximum mean discrepancy) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification to the MMD test based on spectral regularization by taking into account the covariance information (which is not captured by the MMD test) and prove the proposed test to be minimax optimal with a smaller separation boundary than that achieved by the MMD test. Third, we propose an adaptive version of the above test which involves a data-driven strategy to choose the regularization parameter and show the adaptive test to be almost minimax optimal up to a logarithmic factor. Moreover, our results hold for the permutation variant of the test where the test threshold is chosen elegantly through the permutation of the samples. Through numerical experiments on synthetic and real-world data, we demonstrate the superior performance of the proposed test in comparison to the MMD test.
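For orientation, the baseline that the abstract compares against (the MMD two-sample test with a permutation-based threshold) can be sketched as follows. This is a minimal illustration, not the spectral regularized test proposed in the paper and not code from the companion repository. The Gaussian kernel, the fixed bandwidth, the unbiased estimator, and all function names are assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a and rows of b (kernel choice is an assumption)."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd2_from_gram(k, idx_x, idx_y):
    """Unbiased estimate of squared MMD from a pooled Gram matrix and two index sets."""
    n, m = len(idx_x), len(idx_y)
    kxx = k[np.ix_(idx_x, idx_x)]
    kyy = k[np.ix_(idx_y, idx_y)]
    kxy = k[np.ix_(idx_x, idx_y)]
    # Within-sample terms exclude the diagonal (i == j) entries.
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=200, seed=0):
    """Return (observed statistic, permutation p-value) for H0: both samples share a distribution."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    n, total = len(x), len(x) + len(y)
    k = gaussian_kernel(z, z, bandwidth)  # computed once, reused for every permutation
    observed = mmd2_from_gram(k, np.arange(n), np.arange(n, total))
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(total)  # reshuffle the pooled sample labels
        if mmd2_from_gram(k, perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value
```

A call such as `mmd_permutation_test(x, y)` with two arrays of shape `(n, d)` and `(m, d)` returns the statistic and a p-value; the null hypothesis is rejected when the p-value falls below the chosen level. The test described in the abstract additionally uses covariance information through spectral regularization, which this sketch does not attempt.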
Has companion code repository: https://github.com/OmarHagrass/Spectral-regularized-two-sample-test