The Broad Optimality of Profile Maximum Likelihood

Publication:6320215

arXiv: 1906.03794
MaRDI QID: Q6320215

Author name not available

Publication date: 10 June 2019

Abstract: We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks. In particular, for every alphabet size $k$ and desired accuracy $\varepsilon$:

\textbf{Distribution estimation}: Under the $\ell_1$ distance, PML yields optimal $\Theta(k/(\varepsilon^2 \log k))$ sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms the Good-Turing estimator on the actual distribution.

\textbf{Additive property estimation}: For a broad class of additive properties, the PML plug-in estimator uses just four times the sample size required by the best estimator to achieve roughly twice its error, with exponentially higher confidence.

\textbf{Rényi entropy estimation}: For integer $\alpha > 1$, the PML plug-in estimator has optimal $k^{1-1/\alpha}$ sample complexity; for non-integer $\alpha > 3/4$, the PML plug-in estimator has sample complexity lower than the state of the art.

\textbf{Identity testing}: In testing whether an unknown distribution is equal to, or at least $\varepsilon$ far from, a given distribution in $\ell_1$ distance, a PML-based tester achieves the optimal sample complexity up to logarithmic factors of $k$.

Most of these results also hold for a near-linear-time computable variant of PML. Stronger results hold for a different and novel variant called truncated PML (TPML).
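The abstract does not restate what PML is; for reference, here is a sketch of the standard objective from the PML literature (the notation $X^n$ for the sample and $\varphi$ for the profile is assumed here, not taken from this page): PML picks the distribution maximizing the probability of the sample's profile, i.e., of its multiset of symbol multiplicities.

```latex
% Sketch of the standard PML objective (notation assumed, not from this page).
% \varphi(X^n): the profile of the sample X^n = X_1, ..., X_n, i.e. the
% multiset of multiplicities of the distinct observed symbols.
\[
  p_{\mathrm{PML}}(X^n) \;\in\; \operatorname*{arg\,max}_{p}
  \; \Pr_{Y^n \sim p}\!\bigl( \varphi(Y^n) = \varphi(X^n) \bigr),
\]
% where the maximization runs over all candidate distributions p.  The
% plug-in estimators mentioned in the abstract then evaluate the property
% of interest (entropy, support size, ...) at p_PML.
```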




Has companion code repository: https://github.com/ucsdyi/PML
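To make the profile notion concrete, the following is a minimal Python sketch, independent of the linked repository's code and API: it computes a sample's profile and grid-searches small simplices for the distribution maximizing the profile probability, a brute-force stand-in for PML. All names (`profile`, `profile_probability`, `simplex_grid`) are illustrative choices, not taken from the paper or repository, and the enumeration is exponential, so it only works on toy inputs.

```python
import itertools
from collections import Counter
from math import factorial, prod

def profile(sample):
    """Profile of a sample: the sorted multiset of symbol multiplicities.
    E.g. profile("abac") == (1, 1, 2): two symbols seen once, one seen twice."""
    return tuple(sorted(Counter(sample).values()))

def profile_probability(phi, p):
    """Pr[an i.i.d. sample of size sum(phi) from p has profile phi].

    Brute force: sums over all injective assignments of the multiplicities
    in phi to the atoms of p; exponential in len(phi), so toy-sized only."""
    n = sum(phi)
    seqs_per_assignment = factorial(n) // prod(factorial(m) for m in phi)
    # Ordered tuples that merely permute positions of equal multiplicities
    # describe the same symbol-to-multiplicity map; divide out the overcount.
    overcount = prod(factorial(c) for c in Counter(phi).values())
    total = sum(
        prod(p[s] ** m for s, m in zip(symbols, phi))
        for symbols in itertools.permutations(range(len(p)), len(phi))
    )
    return seqs_per_assignment * total / overcount

def simplex_grid(k, steps):
    """All distributions on k atoms whose probabilities are multiples of 1/steps."""
    def rec(parts, remaining):
        if parts == 1:
            yield (remaining,)
            return
        for i in range(remaining + 1):
            for rest in rec(parts - 1, remaining - i):
                yield (i,) + rest
    for counts in rec(k, steps):
        yield tuple(c / steps for c in counts)

if __name__ == "__main__":
    phi = profile("abac")  # -> (1, 1, 2), n = 4
    # Grid-search "PML": over supports of size 3 and 4 here, find the
    # distribution maximizing the probability of observing this profile.
    best_val, best_p = max(
        (profile_probability(phi, p), p)
        for k in (3, 4)
        for p in simplex_grid(k, 12)
    )
    print("profile:", phi)
    print("approx PML distribution:", best_p, "profile prob:", round(best_val, 4))
```

Exact PML computation is far harder than this grid search suggests; the near-linear-time computable variant mentioned in the abstract exists precisely because exhaustive maximization is infeasible beyond toy sizes.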







