Kernel-Based Tests for Likelihood-Free Hypothesis Testing

Gerber, Patrik Róbert; Jiang, Tianze; Polyanskiy, Yury; Sun, Rui

arXiv:2308.09043 (stat)

[Submitted on 17 Aug 2023 (v1), last revised 23 Nov 2023 (this version, v2)]

Abstract: Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to one of the two classes. Special cases of this problem are well known: with complete knowledge of the class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. Recent work discovered a fundamental trade-off between $m$ and $n$: increasing the size $m$ of the data sample reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples come from a mixture of the two classes, a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under maximum mean discrepancy (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detection of the Higgs boson and detection of planted DDPM-generated images amidst CIFAR-10 images. For both problems we confirm the existence of the theoretically predicted asymmetric $m$ vs $n$ trade-off.
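To make the setting concrete, the following is a minimal Python sketch of labeling a batch of $m$ unlabeled inputs by comparing its empirical MMD to each of two labeled samples. It is an illustration under stated assumptions, not the test studied in the paper: the fixed Gaussian kernel, the `bandwidth` parameter, the biased V-statistic estimator, and the function names `gaussian_kernel`, `mmd_squared`, and `label_batch` are all choices made here for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(a, b, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples a and b."""
    return (
        gaussian_kernel(a, a, bandwidth).mean()
        + gaussian_kernel(b, b, bandwidth).mean()
        - 2.0 * gaussian_kernel(a, b, bandwidth).mean()
    )


def label_batch(x0, x1, z, bandwidth=1.0):
    """Assign the whole batch z to class 0 or 1, whichever labeled sample
    is closer to z in estimated squared MMD (illustrative rule only)."""
    d0 = mmd_squared(z, x0, bandwidth)
    d1 = mmd_squared(z, x1, bandwidth)
    return 0 if d0 <= d1 else 1
```

As a usage example, with `x0` and `x1` drawn from two well-separated Gaussians (the simulated, labeled samples of size $n$ each) and `z` drawn from the second one (the experimental sample of size $m$), `label_batch(x0, x1, z)` would be expected to return 1. In this sketch, varying the sizes of `z` and of `x0`, `x1` is one simple way to explore the $m$ vs $n$ trade-off empirically.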

Comments:	36 pages, 6 figures
Subjects:	Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)
Cite as:	arXiv:2308.09043 [stat.ML]
	(or arXiv:2308.09043v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2308.09043

Submission history

From: Rui Sun
[v1] Thu, 17 Aug 2023 15:24:03 UTC (851 KB)
[v2] Thu, 23 Nov 2023 23:39:55 UTC (880 KB)
