Type: | Package |
Title: | Binary Expansion Testing |
Version: | 0.5.4 |
Depends: | R (≥ 3.5.0) |
Description: | Nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET). See Kai Zhang (2019) BET on Independence, Journal of the American Statistical Association, 114:528, 1620-1637, <doi:10.1080/01621459.2018.1537921>, Kai Zhang, Wan Zhang, Zhigen Zhao, Wen Zhou. (2023). BEAUTY Powered BEAST, <doi:10.48550/arXiv.2103.00674> and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, Kai Zhang. (2023) SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables, Technical report. |
License: | GPL-2 | GPL-3 [expanded from: GPL] |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | Rcpp (≥ 0.12.3) |
LinkingTo: | Rcpp |
RoxygenNote: | 7.1.1 |
NeedsCompilation: | yes |
Packaged: | 2024-09-09 02:34:45 UTC; zhangwan |
Author: | Wan Zhang [aut, cre], Zhigen Zhao [aut], Michael Baiocchi [aut], Kai Zhang [aut] |
Maintainer: | Wan Zhang <wanz63@live.unc.edu> |
Repository: | CRAN |
Date/Publication: | 2024-09-09 03:40:01 UTC |
Binary Expansion Testing
Description
The BET package provides functions for nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET).
BET functions
MaxBET
symm
get.signs
cell.counts
bet.plot
MaxBETs
BEAST
Reference(s)
Kai Zhang (2019) BET on Independence, Journal of the American Statistical Association, 114:528, 1620-1637, doi:10.1080/01621459.2018.1537921, Kai Zhang, Zhigen Zhao, and Wen Zhou (2021). BEAUTY Powered BEAST, <arXiv:2103.00674> and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, Kai Zhang. SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables. Technical report, 2023.
Binary Expansion Adaptive Symmetry Test
Description
BEAST (Binary Expansion Adaptive Symmetry Test) is used for nonparametric detection of nonuniformity or dependence.
Usage
BEAST(
X,
dep,
subsample.percent = 1/2,
B = 100,
unif.margin = FALSE,
lambda = NULL,
index = list(c(1:ncol(X))),
method = "p",
num = NULL
)
Arguments
X |
a matrix to be tested. |
dep |
depth of the binary expansion for the |
subsample.percent |
proportion of the sample used for each subsample. |
B |
number of subsamples drawn. |
unif.margin |
logicals. If |
lambda |
tuning parameter for soft-thresholding, default to be |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, |
method |
If |
num |
number of permutations if method == "p" (default to be 100), or simulations if method == "s" (default to be 1000). |
Value
Interaction |
the most frequent interaction among all subsamples. |
BEAST.Statistic |
BEAST statistic. |
Null.Distribution |
simulated null distribution. |
p.value |
simulated p-value. |
Examples
## Elapsed time: 7.32 secs
## Measured in R 4.0.2, 32 bit, on a 3.3 GHz 6-Core Intel Core i5 processor under macOS, 2024/9/6
## Not run:
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.8*rnorm(128)
##test independence between (x1, x2) and y
BEAST(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))
##test mutual independence among x1, x2 and y
BEAST(cbind(x1, x2, y), 3, index = list(1, 2, 3))
##test bivariate uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
BEAST(cbind(x1, x2), 3)
##test multivariate uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
BEAST(cbind(x1, x2, x3), 3)
## End(Not run)
Binary Expansion Testing at a Certain Depth
Description
MaxBET stands for Binary Expansion Testing. It is used for nonparametric detection of nonuniformity or dependence. It can be used to test whether a column vector is [0, 1]-uniformly distributed. It can also be used to detect dependence between columns of a matrix X, if X has more than one column.
Usage
MaxBET(
X,
dep,
unif.margin = FALSE,
asymptotic = TRUE,
plot = FALSE,
index = list(c(1:ncol(X)))
)
Arguments
X |
a matrix to be tested. When |
dep |
depth of the binary expansion for the |
unif.margin |
logicals. If |
asymptotic |
logicals. If |
plot |
logicals. If |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, |
Details
MaxBET tests independence or uniformity by considering the maximal magnitude of the symmetry statistics in the \sigma-field generated from marginal binary expansions at depth d.
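To make the statistic concrete, here is a minimal base R sketch for two variables. It assumes both variables already lie in [0, 1) and have more than one observation. The helper names binary_signs, marginal_interactions and max_symmetry_sketch are hypothetical and are not part of the package, and the exact binomial p-value with a Bonferroni factor is one simple choice for illustration, not necessarily the computation MaxBET performs internally.

```r
# Hypothetical helper: first `dep` binary digits of u in [0, 1), recoded as -1/+1.
# Returns an n x dep matrix (assumes length(u) > 1).
binary_signs <- function(u, dep) {
  sapply(seq_len(dep), function(k) 2 * (floor(u * 2^k) %% 2) - 1)
}

# Hypothetical helper: products over every nonempty subset of the bit columns,
# giving an n x (2^dep - 1) matrix of marginal interactions.
marginal_interactions <- function(bits) {
  dep <- ncol(bits)
  subsets <- as.matrix(expand.grid(rep(list(0:1), dep)))[-1, , drop = FALSE]
  apply(subsets, 1, function(s) {
    apply(bits[, as.logical(s), drop = FALSE], 1, prod)
  })
}

# Sketch (an assumption, not the package code): maximal |symmetry statistic|
# over all (2^dep - 1)^2 cross interactions of x and y.
max_symmetry_sketch <- function(x, y, dep) {
  n <- length(x)
  ax <- marginal_interactions(binary_signs(x, dep))
  ay <- marginal_interactions(binary_signs(y, dep))
  s <- crossprod(ax, ay)   # entry (a, b) = sum_i ax[i, a] * ay[i, b]
  s_max <- max(abs(s))
  # Under uniformity and independence, (S + n) / 2 ~ Binomial(n, 1/2).
  p_single <- min(1, 2 * pbinom((n + s_max) / 2 - 1, n, 0.5, lower.tail = FALSE))
  list(max.abs.stat = s_max, p.value.bonf = min(1, p_single * length(s)))
}

# Usage on simulated data (output depends on the seed, so none is shown).
set.seed(1)
v <- runif(128)
max_symmetry_sketch(v, (v + runif(128, 0, 0.1)) %% 1, dep = 3)
```

Each symmetry statistic is a sum of n values of -1 or +1, so a large magnitude means the points fall unevenly between the positive and negative regions of that interaction.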
Value
Interaction |
a dataframe with |
Extreme.Asymmetry |
the extreme asymmetry statistics. |
p.value.bonf |
p-value of the test with Bonferroni adjustment. |
z.statistic |
normal approximation of the test statistic. |
Examples
##test mutual independence
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
MaxBET(cbind(X1, X2), 3, asymptotic = FALSE, index = list(1,2))
##test independence between (x1, x2) and y
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.4*rnorm(128)
MaxBET(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))
##test uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
MaxBET(cbind(x1, x2, x3), 3)
Binary Expansion Testing up to a Certain Depth
Description
MaxBETs is used for nonparametric dependence detection. Extending BET, for a chosen maximal depth d.max, MaxBETs performs a sequential test for all 2 \le d \le d.max and avoids overlapping symmetry statistics across depths. The adjustment multiplies by the number of interactions that are in the \sigma-field generated by marginal binary expansions at depth d but not in that at depth d-1.
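For two variables, the number of interactions that first appear at each depth can be worked out directly. The sketch below assumes the two-column case, where the \sigma-field at depth d contains (2^d - 1)^2 cross interactions. The names cross_count and new_cross_count are hypothetical, and the sketch only counts interactions; it does not reproduce how MaxBETs combines the depths.

```r
# Cross interactions available at depth d for two variables (two-column case only).
cross_count <- function(d) (2^d - 1)^2

# Interactions in the depth-d sigma-field that are not in the depth-(d - 1) one.
new_cross_count <- function(d) cross_count(d) - cross_count(d - 1)

d.max <- 4
depths <- 2:d.max
new_counts <- new_cross_count(depths)
data.frame(depth = depths, new.interactions = new_counts)

# Together with the single cross interaction at depth 1, the counts add up
# to all cross interactions at depth d.max.
total <- cross_count(1) + sum(new_counts)
```

Counting only the new interactions at each depth is what keeps a statistic from being tested twice at different depths.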
Usage
MaxBETs(
X,
d.max = 4,
unif.margin = FALSE,
asymptotic = TRUE,
plot = FALSE,
index = list(c(1:ncol(X)))
)
Arguments
X |
a matrix to be tested. When |
d.max |
the maximal depth of the binary expansion for |
unif.margin |
logicals. If |
asymptotic |
logicals. If |
plot |
logicals. If |
index |
a list of indices. If provided, test the independence among two or more groups of variables, for example, |
Value
bet.s.pvalue.bonf |
the overall p-value of the test. |
bet.s.index |
the interaction at which the p-value is minimal. |
bet.s.zstatistic |
normal approximation of the test statistic. |
Examples
##test mutual independence
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
MaxBETs(cbind(X1, X2), 3, asymptotic = FALSE, index = list(1,2))
##test independence between (x1, x2) and y
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.4*rnorm(128)
MaxBETs(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))
##test uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
MaxBETs(cbind(x1, x2, x3), 3)
Plotting Binary Expansion Testing (2-dimensions)
Description
bet.plot shows the cross interaction of the strongest asymmetry, which the BET returns with the rejection of the independence null.
This function only works for the test on two variables, that is, X can only have two columns.
There are 2^{2dep} - 1 nontrivial binary variables in the \sigma-field and (2^{dep} - 1)^2 of them are cross interactions, whose positive regions are plotted in white and whose negative regions are plotted in blue.
bet.plot shows the cross interaction where the difference between the numbers of observations in the positive and negative regions is largest.
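The two counts above can be checked by enumeration. In this sketch each nontrivial binary variable is indexed by a nonzero 0/1 vector of length 2 * dep, with the first dep entries selecting bits of the first column and the rest selecting bits of the second. This indexing scheme and the variable names are assumptions made for illustration.

```r
dep <- 3

# All 0/1 vectors of length 2 * dep, minus the all-zero (trivial) one.
idx <- as.matrix(expand.grid(rep(list(0:1), 2 * dep)))
idx <- idx[rowSums(idx) > 0, , drop = FALSE]

# A cross interaction uses at least one bit from each column of X.
uses_first  <- rowSums(idx[, 1:dep, drop = FALSE]) > 0
uses_second <- rowSums(idx[, (dep + 1):(2 * dep), drop = FALSE]) > 0

n_nontrivial <- nrow(idx)                     # equals 2^(2 * dep) - 1
n_cross      <- sum(uses_first & uses_second) # equals (2^dep - 1)^2
n_marginal   <- n_nontrivial - n_cross        # interactions using one column only
```

With dep = 3 this gives 63 nontrivial binary variables, 49 of which are cross interactions; the remaining 14 involve only one of the two columns.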
Usage
bet.plot(X, dep, unif.margin = FALSE, cex = 0.5, index = list(c(1:ncol(X))), ...)
Arguments
X |
a matrix with two columns. |
dep |
depth of BET. |
unif.margin |
logicals. If |
cex |
number indicating the amount by which plotting text and symbols should be scaled relative to the default. |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, |
... |
graphical parameters to plot |
Examples
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
bet.plot(cbind(X1, X2), 3, index = list(1,2))
Counts the number of points in each cell after binary expansion.
Description
cell.counts returns the number of data points in each cell obtained from the binary expansion.
Usage
cell.counts(X, dep, unif.margin = FALSE)
Arguments
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
Value
The result is a dataframe with 2^(p*dep) rows and 2 columns, where p is the number of columns of X. The first column is the binary index, the second column is the number of data points.
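The idea behind the cell counts can be sketched in base R as follows. This is an illustration under stated assumptions, not the package implementation: the data are taken to lie in [0, 1] already, p * dep is assumed to be at most 31, and the function name cell_counts_sketch, the column names and the label format of the binary index are all hypothetical.

```r
# Sketch: count points in each of the 2^(p * dep) cells of the unit cube.
cell_counts_sketch <- function(U, dep) {
  p <- ncol(U)
  m <- 2^dep
  # Per-column cell number in 0 .. m - 1 (a value of exactly 1 goes in the last cell).
  cells <- pmin(floor(U * m), m - 1)
  # Combine the per-column cells into a single id in 0 .. m^p - 1.
  id <- as.vector(cells %*% m^((p - 1):0))
  counts <- tabulate(id + 1, nbins = m^p)
  # Write an id as a binary string of length p * dep, most significant bit first.
  to_binary <- function(k) {
    paste(rev(as.integer(intToBits(k))[seq_len(p * dep)]), collapse = "")
  }
  data.frame(
    binary.index = vapply(0:(m^p - 1), to_binary, character(1)),
    count = counts,
    stringsAsFactors = FALSE
  )
}

# Three points in two dimensions at depth 1: two fall in cell "00", one in "11".
U <- cbind(c(0.1, 0.1, 0.9), c(0.2, 0.2, 0.6))
cell_counts_sketch(U, dep = 1)
```

Empty cells are kept with a count of zero, so the result always has one row per cell.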
Examples
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
cell.counts(cbind(X1, X2), 3)
Signs of Colors of all Points for all Interactions
Description
get.signs returns all the signs of colors for each point under all interactions up to depth d in marginal binary expansions for the tests BET and BETs.
Usage
get.signs(X, dep, unif.margin = FALSE)
Arguments
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
Value
The result is a dataframe with n rows and 2^(p*dep) columns, where p is the number of columns of X and n is the number of rows of X. The values of 1 or -1 stand for the sign of color, while the marginal interactions return 0.
Examples
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
get.signs(cbind(X1, X2), 3)
Coordinates of Brightest Stars in the Night Sky
Description
This data set collects the galactic coordinates of the 256 brightest stars in the night sky (Perryman et al. 1997). We consider the longitude (x) and sine latitude (y) here.
Usage
data(star)
Format
An object of class data.frame with 256 rows and 2 columns.
Examples
data(star)
MaxBETs(cbind(star$x.raw, star$y.raw), asymptotic = FALSE, plot = TRUE, index = list(1,2))
Symmetry Statistics for all Interactions
Description
symm returns all the symmetry statistics up to depth d in marginal binary expansions for the tests BET and BETs.
Usage
symm(
X,
dep,
unif.margin = FALSE,
print.sample.size = TRUE
)
Arguments
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
print.sample.size |
logicals. If |
Value
The result is a dataframe with (p+2) columns, where p is the number of columns of X. The first column gives the binary index for all variables, the next p columns display the interactions of the respective variables, and the last column, Statistics, gives the respective symmetry statistic.
Examples
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
symm(cbind(X1, X2), 3)