This package implements a robust hypothesis testing procedure: the Lq-likelihood-ratio-type test (LqRT), introduced in Qin and Priebe (2017). The code replicates and extends the R package which can be found here. The paper supporting this package is currently in review; a preprint can be found here.
In order to install the package one needs to have python 3.x installed, then either clone the repository from github and install using pip:
git clone https://github.com/alyakin314/lqrt
cd lqrt
pip install .
on install directly via pip:
pip install lqrt
The recommended import line is
>>> import lqrt
All examples below are performed using this import line.
(All examples also assume the usage of numpy and scipy.stats imported as:
>>> import numpy as np
>>> from scipy import stats
This is unnecessary for the actual usage of the package, only for examples)
There are three tests implemented in this repostory: one sample, related samples (also known as paired) and independent samples (also known as unpaired).
The one sample test performs the Lq-Likelihood-test for the mean of one group
of scores.
It is a robust two-sided test for the null hypothesis that the expected
value (mean) of a sample of independent observations is equal to the given
population mean.
It can be thought of the as a robust version of the single sample t-test
(scipy.stats.ttest_1samp
).
Test if mean of random sample is equal to true mean, and different mean. We reject the null hypothesis in the second case and don’t reject it in the first case.
>>> np.random.seed(314)
>>> rvs1 = stats.multivariate_normal.rvs(0, 1, 50)
>>> lqrt.lqrtest_1samp(rvs1, 0)
Lqrtest_1sampResult(statistic=0.02388120731922072, pvalue=0.85)
>>> lqrt.lqrtest_1samp(rvs1, 1)
Lqrtest_1sampResult(statistic=35.13171144154751, pvalue=0.0)
The related samples test performs the Lq-Likelihood-test for the mean of
two related samples of scores.
It is a robust two-sided test for the null hypothesis that 2 independent
samples have identical average (expected) values.
It can be thought of as a robust version of the paired t-test
(scipy.stats.ttest_rel
).
A related samples test between two samples with identical means:
>>> np.random.seed(314)
>>> rvs1 = stats.multivariate_normal.rvs(0, 1, 50)
>>> rvs2 = stats.multivariate_normal.rvs(0, 1, 50)
>>> lqrt.lqrtest_rel(rvs1, rvs2)
Lqrtest_relResult(statistic=0.22769245832813567, pvalue=0.66)
A related samples test between two samples with different means:
>>> rvs3 = stats.multivariate_normal.rvs(1, 1, 50)
>>> lqrt.lqrtest_rel(rvs1, rvs3)
Lqrtest_relResult(statistic=27.827284933987784, pvalue=0.0)
The independent samples tests performes the Lq-Likelihood-test for the mean
of two independent samples of scores.
It is a robust two-sided test for the null hypothesis that 2 independent
samples have identical average (expected) values.
it can be thought of as a robust version of the unparied t-test
(scipy.stats.ttest_ind
).
One can perform the test with or without the assumption that the samples have
equal variance, which is estimated together.
This is accomplished by setting the equal_variance flag, similar to scipy's
t-test.
Test with samples with identical means with and without the equal variance assumption. Note that in the unpaired set-up the samples need not to have the same size:
>>> np.random.seed(314)
>>> rvs1 = stats.multivariate_normal.rvs(0, 1, 50)
>>> rvs2 = stats.multivariate_normal.rvs(0, 1, 70)
>>> lqrt.lqrtest_ind(rvs1, rvs2)
LqRtest_indResult(statistic=0.00046542438241203854, pvalue=0.99)
>>> lqrt.lqrtest_ind(rvs1, rvs2, equal_var=False)
LqRtest_indResult(statistic=0.00047040017227573117, pvalue=0.97)
Test with samples with different means with and without the equal variance assumption:
>>> rvs3 = stats.multivariate_normal.rvs(1, 1, 70)
>>> lqrt.lqrtest_ind(rvs1, rvs3)
LqRtest_indResult(statistic=31.09168298440227, pvalue=0.0)
>>> lqrt.lqrtest_ind(rvs1, rvs3, equal_var=False)
LLqRtest_indResult(statistic=31.251454446588696, pvalue=0.0)
All test functions have an argument q which specifies the q parameter of the Lq-likelihood. The q should typically be within [0.5, 1.0] and the lower value is associated with a more robust test. If left unspecified of set to None, the q is estimated using the empirical approximation to the trace of the assymptotic covariance matrix procedure specified in Qin and Priebe (2017).
>>> x_true = np.random.normal(0.34, 1, 40)
>>> x_contamination = np.random.normal(0.34, 1, 10)
>>> x_sample = np.concatenate([x_true, x_contamination])
>>> lqrt.lqrtest_1samp(x_sample, 0, q = 0.9)
Lqrtest_1sampResult(statistic=1.1440379636073885, pvalue=0.28)
>>> lqrt.lqrtest_1samp(x_sample, 0, q = 0.6)
Lqrtest_1sampResult(statistic=3.710699836358458, pvalue=0.08)
>>> lqrt.lqrtest_1samp(x_sample, 0)
Lqrtest_1sampResult(statistic=5.5937088664291394, pvalue=0.06)
The p-value for the tests is obtained via a bootstrap procedure, outlined in Qin and Priebe (2017). By default - 100 resamples are used, but the number can be changed. Increasing the number of samples increases the precision of the p-value, but adds on computational work.
>>> x = np.random.normal(0, 1, 50)
>>> lqrt.lqrtest_1samp(x, 0) # takes ~0.25s
Lqrtest_1sampResult(statistic=0.36665186821102225, pvalue=0.58)
>>> lqrt.lqrtest_1samp(x, 0, bootstrap=1000) # takes ~1.5s
Lqrtest_1sampResult(statistic=0.36665186821102225, pvalue=0.541)
>>> lqrt.lqrtest_1samp(x, 0, bootstrap=10000) # takes ~15s
Lqrtest_1sampResult(statistic=0.36665186821102225, pvalue=0.5483)
Qin, Y., & Priebe, C. E. (2017). Robust Hypothesis Testing via Lq-Likelihood. Statistica Sinica. 27. 10.5705/ss.202015.0441.
Alyakin, A., Qin, Y., & Priebe, C. E. (2019). LqRT: Robust Hypothesis Testing of Location Parameters using Lq-Likelihood-Ratio-Type Test in Python. arXiv preprint arXiv:1911.11922.