Kernel Thinning


arXiv: 2105.05842
MaRDI QID: Q6367524

Author name not available

Publication date: 12 May 2021

Abstract: We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}^{\star}$ and $\mathcal{O}(n^2)$ time, kernel thinning compresses an $n$-point approximation to $\mathbb{P}$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated reproducing kernel Hilbert space. The maximum discrepancy in integration error is $\mathcal{O}_d(n^{-1/2}\sqrt{\log n})$ in probability for compactly supported $\mathbb{P}$ and $\mathcal{O}_d(n^{-1/2}(\log n)^{(d+1)/2}\sqrt{\log\log n})$ for sub-exponential $\mathbb{P}$ on $\mathbb{R}^d$. In contrast, an equal-sized i.i.d. sample from $\mathbb{P}$ suffers $\Omega(n^{-1/4})$ integration error. Our sub-exponential guarantees resemble the classical quasi-Monte Carlo error rates for uniform $\mathbb{P}$ on $[0,1]^d$ but apply to general distributions on $\mathbb{R}^d$ and a wide range of common kernels. Moreover, the same construction delivers near-optimal $L^\infty$ coresets in $\mathcal{O}(n^2)$ time. We use our results to derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels and present two vignettes illustrating the practical benefits of kernel thinning over i.i.d. sampling and standard Markov chain Monte Carlo thinning, in dimensions $d=2$ through $100$.
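
To make the compression step concrete, the following is a minimal NumPy sketch of repeated kernel halving followed by a maximum mean discrepancy (MMD) comparison against an equal-sized i.i.d. subsample. It uses a Gaussian kernel and a fixed self-balancing-walk threshold, whereas the paper's KT-SPLIT/KT-SWAP procedure uses adaptively chosen thresholds and a final swap refinement; every name and parameter below (gauss_kernel, kernel_halving, kernel_thin, mmd, sigma, a) is an illustrative assumption, not the authors' implementation.

# Illustrative sketch only: a simplified kernel-halving loop in the spirit of
# kernel thinning, not the authors' KT-SPLIT/KT-SWAP algorithm or API.
import numpy as np


def gauss_kernel(X, Y, sigma=1.0):
    # Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def kernel_halving(X, sigma=1.0, a=1.0, rng=None):
    # Keep one point from each consecutive pair, choosing signs that steer the
    # signed RKHS discrepancy toward zero (a simplified self-balancing walk
    # with a fixed threshold a; the paper uses adaptive thresholds).
    rng = np.random.default_rng(rng)
    n = len(X) - (len(X) % 2)              # ignore a trailing unpaired point
    K = gauss_kernel(X[:n], X[:n], sigma)
    c = np.zeros(n)                        # coefficients of the running discrepancy psi
    kept = []
    for t in range(0, n, 2):
        f = K[:, t] - K[:, t + 1]          # evaluations of f_t = k(x_t, .) - k(x_{t+1}, .)
        alpha = c @ f                      # <psi, f_t> in the RKHS
        p_plus = np.clip(0.5 * (1.0 - alpha / a), 0.0, 1.0)
        eta = 1.0 if rng.random() < p_plus else -1.0
        c[t], c[t + 1] = eta, -eta         # psi <- psi + eta * f_t
        kept.append(t if eta > 0 else t + 1)
    return X[kept]


def kernel_thin(X, sigma=1.0, seed=None):
    # Halve repeatedly until roughly sqrt(n) points remain.
    rng = np.random.default_rng(seed)
    target = int(np.ceil(np.sqrt(len(X))))
    S = X
    while len(S) > target:
        S = kernel_halving(S, sigma=sigma, rng=rng)
    return S


def mmd(X, S, sigma=1.0):
    # Maximum mean discrepancy between the empirical distributions of X and S.
    kxx = gauss_kernel(X, X, sigma).mean()
    kxs = gauss_kernel(X, S, sigma).mean()
    kss = gauss_kernel(S, S, sigma).mean()
    return np.sqrt(max(kxx - 2.0 * kxs + kss, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1024, 2))     # n = 1024 points in d = 2
    S_kt = kernel_thin(X, sigma=1.0, seed=1)
    S_iid = X[rng.choice(len(X), size=len(S_kt), replace=False)]
    print("coreset size:", len(S_kt))
    print("MMD of kernel-thinned coreset:", mmd(X, S_kt))
    print("MMD of i.i.d. subsample:      ", mmd(X, S_iid))

The fixed threshold a is the main simplification here; the paper's adaptively chosen thresholds are what underpin the $\mathcal{O}_d(n^{-1/2})$-type guarantees quoted above.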




Has companion code repository: https://github.com/microsoft/goodpoints
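
For orientation, a hedged usage sketch against the companion repository follows. The module path, entry point, and parameters (goodpoints.kt.thin, split_kernel, swap_kernel, delta, seed) are recalled from the repository's documentation and should be treated as assumptions; consult the README at the URL above for the authoritative interface.

# Hedged sketch only: treat goodpoints.kt.thin and its parameters as assumed;
# verify against https://github.com/microsoft/goodpoints before relying on it.
import numpy as np
from goodpoints import kt   # assumed package name: pip install goodpoints

rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 2))   # n = 1024 input points in d = 2

def gauss(y, X, sigma=1.0):
    # Assumed kernel-function signature: evaluations k(y, x_i) for each row x_i of X.
    return np.exp(-np.sum((X - y) ** 2, axis=1) / (2.0 * sigma ** 2))

# m halving rounds thin n points to n / 2^m; here m = log4(1024) = 5 leaves
# roughly sqrt(n) = 32 points.
coreset = kt.thin(X, 5, split_kernel=gauss, swap_kernel=gauss, delta=0.5, seed=123)
print(coreset)   # assumed to be the indices of the retained points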








