AdapTesting: A Data-adaptive Hypothesis Testing Toolbox


Introduction

Hypothesis testing serves as a fundamental statistical tool in machine learning (two-sample testing, independence testing, etc.). Despite its importance, current implementations face significant challenges: fragmented code across different languages and frameworks, lack of unified standards, and complex integration processes requiring users to download, understand, and modify multiple source code repositories.

To address these challenges, we present AdapTesting, a comprehensive toolbox that unifies state-of-the-art hypothesis testing methods for machine learning applications. Our toolbox reduces the testing process to its essence: users only need to provide their data and optionally specify the desired testing method by name (a default method is provided that balances time efficiency and test power) to receive comprehensive results, including the test decision, p-value, and test statistic. By standardizing the implementation on Python and PyTorch, we ensure GPU-accelerated computation across different computing systems while maintaining a consistent interface.

Our initial release focuses on implementing comprehensive two-sample testing methods, with planned extensions to include independence testing and other frequently used hypothesis tests in subsequent releases. Through AdapTesting, we aim to democratize statistical testing in machine learning by providing a unified, efficient, and accessible framework that bridges the gap between theoretical methods and practical applications.

Methods for TST and their referenced papers

Installation and Usage

Installation

You can install the package with pip directly from GitHub:

pip install git+https://github.com/yeager20001118/AdapTesting

or install it from PyPI once it is released and published:

pip install adaptesting

Example usage of Two-sample Testing

The detailed demo examples (for tabular, image and text data) can be found in the examples directory.

from adaptesting import tst # Import main function 'tst' from package 'adaptesting'

# Example synthetic input data
import torch
from torch.distributions import MultivariateNormal
mean = torch.tensor([0.5, 0.5])
cov1, cov2 = torch.tensor([[1.0, 0.5], [0.5, 1.0]]), torch.tensor([[1.0, 0], [0, 1.0]])
mvn1, mvn2 = MultivariateNormal(mean, cov1), MultivariateNormal(mean, cov2)

# Replace X, Y with your own data; make sure both are of type torch.Tensor
X, Y = mvn1.sample((1000,)), mvn2.sample((1000,)) 

# Five SOTA TST methods to choose from:
h, mmd_value, p_value = tst(X, Y, device="cuda") # Default method using the median heuristic

# Other available methods and their default arguments setting (uncomment to use):
# h, mmd_value, p_value = tst(X, Y, device="cuda", method="fuse", kernel="laplace_gaussian", n_perm=2000)
# h, mmd_value, p_value = tst(X, Y, device="cuda", method="agg", n_perm=3000)
# h, mmd_value, p_value = tst(X, Y, device="cuda", method="clf", data_type="tabular", patience=150, n_perm=200)
# h, mmd_value, p_value = tst(X, Y, device="cuda", method="deep", data_type="tabular", patience=150, n_perm=200)

"""
Output of tst: 
    (result of testing: 0 or 1, 
    mmd value of two samples, 
    p-value of testing)

If testing the two samples are from different distribution, the console will output 
    'Reject the null hypothesis with p-value: ..., the MMD value is ...'.
Otherwise,
    'Fail to reject the null hypothesis with p-value: ..., the MMD value is ...'.
"""

Performance of TST methods

Performance evaluations and benchmarks across tabular, image, and text data can be found in the examples directory.

  • Test Power: rate of correctly rejecting H0 when it is false (higher is better)
  • Type-I Error: rate of falsely rejecting H0 when it is true (should be $\leq \alpha$, the significance level)
  • Running Time: Computational time in seconds
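
Both rejection rates are estimated empirically by running the test many times on freshly drawn samples. A minimal sketch of that procedure, using a simple permutation test on the difference of means instead of AdapTesting's MMD-based tests so it stays dependency-free (all function names here are illustrative):

```python
import random

def perm_test(x, y, n_perm=200, alpha=0.05, rng=None):
    # Permutation two-sample test on the absolute difference of means
    rng = rng or random.Random(0)
    stat = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = x + y
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= stat:
            count += 1
    p = (count + 1) / (n_perm + 1)
    return int(p <= alpha), p

def rejection_rate(sampler_x, sampler_y, n_trials=100, n=50):
    # Fraction of trials in which H0 is rejected: this estimates
    # test power if the samplers differ, type-I error if they are identical
    rng = random.Random(42)
    rejections = 0
    for _ in range(n_trials):
        x = [sampler_x(rng) for _ in range(n)]
        y = [sampler_y(rng) for _ in range(n)]
        h, _ = perm_test(x, y, rng=rng)
        rejections += h
    return rejections / n_trials

# Same distribution: estimates the type-I error (should be near alpha)
type1 = rejection_rate(lambda r: r.gauss(0, 1), lambda r: r.gauss(0, 1))
# Shifted distribution: estimates the test power (should be near 1)
power = rejection_rate(lambda r: r.gauss(0, 1), lambda r: r.gauss(1, 1))
```

The benchmarks in the examples directory report exactly these quantities for each TST method, alongside wall-clock running time.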

Contributors

This work is done by