| Type: | Package | 
| Title: | Scalable Spike-and-Slab | 
| Version: | 1.0 | 
| Date: | 2022-05-13 | 
| Description: | A scalable Gibbs sampling implementation for high dimensional Bayesian regression with the continuous spike-and-slab prior. Niloy Biswas, Lester Mackey and Xiao-Li Meng, "Scalable Spike-and-Slab" (2022) <doi:10.48550/arXiv.2204.01668>. | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| Imports: | Rcpp, stats, TruncatedNormal | 
| LinkingTo: | Rcpp, RcppEigen | 
| RoxygenNote: | 7.1.2 | 
| NeedsCompilation: | yes | 
| Packaged: | 2022-05-17 20:40:23 UTC; niloybiswas | 
| Author: | Niloy Biswas  | 
| Maintainer: | Niloy Biswas <niloy_biswas@g.harvard.edu> | 
| Depends: | R (≥ 3.5.0) | 
| Repository: | CRAN | 
| Date/Publication: | 2022-05-18 17:00:07 UTC | 
Riboflavin GWAS dataset
Description
Dataset of riboflavin production by Bacillus subtilis, containing n = 71 observations of a one-dimensional response (riboflavin production) and p = 4088 predictors (gene expressions).
Usage
data(riboflavin)
Format
A data frame containing a vector y of length 71 (responses) and a matrix x of dimension 71 by 4088 (gene expressions).
Details
The processed dataset is the same as in the R packages qut and hdi.
References
Bühlmann, P., Kalisch, M. and Meier, L. (2014) High-dimensional statistics with a view toward applications in biology. Annual Review of Statistics and Its Application 1, 255–278.
Examples
data(riboflavin)
y <- as.vector(riboflavin$y)
X <- as.matrix(riboflavin$x)
spike_slab_linear
Description
Generates a Markov chain targeting the posterior of Bayesian linear regression with spike-and-slab priors.
Usage
spike_slab_linear(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  a0 = 1,
  b0 = 1,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL,
  tau0_inverse = NULL,
  tau1_inverse = NULL
)
Arguments
chain_length | 
 Markov chain length  | 
X | 
 design matrix of dimension n by p  | 
y | 
 Response  | 
tau0 | 
 prior hyperparameter (non-negative real)  | 
tau1 | 
 prior hyperparameter (non-negative real)  | 
q | 
 prior hyperparameter (strictly between 0 and 1)  | 
a0 | 
 prior hyperparameter (non-negative real)  | 
b0 | 
 prior hyperparameter (non-negative real)  | 
rinit | 
 initial distribution of Markov chain (default samples from the prior)  | 
verbose | 
 print iteration of the Markov chain (boolean)  | 
burnin | 
 chain burnin (non-negative integer)  | 
store | 
 store chain trajectory (boolean)  | 
Xt | 
 Pre-calculated transpose of X  | 
XXt | 
 Pre-calculated matrix X*transpose(X) (n by n matrix)  | 
tau0_inverse | 
 Pre-calculated matrix inverse(I + tau0^2*XXt) (n by n matrix)  | 
tau1_inverse | 
 Pre-calculated matrix inverse(I + tau1^2*XXt) (n by n matrix)  | 
Value
Output from Markov chain targeting the posterior corresponding to Bayesian linear regression with spike and slab priors
Examples
# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='linear')
X <- syn_data$X
y <- syn_data$y
# Hyperparameters
params <- spike_slab_params(n=nrow(X),p=ncol(X))
# Run S^3
sss_chain <- spike_slab_linear(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,a0=params$a0,b0=params$b0,
verbose=FALSE,store=FALSE)
# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]
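The last line above reads off the first ten entries of z_ergodic_avg as posterior probabilities for variable selection. As a rough sketch of one way to turn such probabilities into a selected set, the base R snippet below thresholds them at 0.5. The vector z_avg and the 0.5 cutoff are illustrative assumptions, not package output or a package default.

```r
# Hypothetical inclusion probabilities standing in for sss_chain$z_ergodic_avg
z_avg <- c(0.98, 0.91, 0.07, 0.55, 0.02, 0.49)

# Assumed cutoff: keep covariates whose inclusion probability exceeds 0.5
threshold <- 0.5
selected <- which(z_avg > threshold)

# Indices of the covariates retained under this rule
print(selected)
```

A different cutoff may be more appropriate depending on how false positives and false negatives are weighed in the application.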
spike_slab_logistic
Description
Generates a Markov chain targeting the posterior of Bayesian logistic regression with spike-and-slab priors.
Usage
spike_slab_logistic(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL
)
Arguments
chain_length | 
 Markov chain length  | 
X | 
 design matrix of dimension n by p  | 
y | 
 Response  | 
tau0 | 
 prior hyperparameter (non-negative real)  | 
tau1 | 
 prior hyperparameter (non-negative real)  | 
q | 
 prior hyperparameter (strictly between 0 and 1)  | 
rinit | 
 initial distribution of Markov chain (default samples from the prior)  | 
verbose | 
 print iteration of the Markov chain (boolean)  | 
burnin | 
 chain burnin (non-negative integer)  | 
store | 
 store chain trajectory (boolean)  | 
Xt | 
 Pre-calculated transpose of X  | 
XXt | 
 Pre-calculated matrix X*transpose(X) (n by n matrix)  | 
Value
Output from Markov chain targeting the posterior corresponding to Bayesian logistic regression with spike and slab priors
Examples
# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='logistic')
X <- syn_data$X
y <- syn_data$y
# Hyperparameters
params <- spike_slab_params(n=nrow(X),p=ncol(X))
# Run S^3
sss_chain <- spike_slab_logistic(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,verbose=FALSE,store=FALSE)
# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]
spike_slab_params
Description
Generates hyperparameters for spike-and-slab
Usage
spike_slab_params(n, p)
Arguments
n | 
 number of observations  | 
p | 
 number of covariates  | 
Value
spike-and-slab hyperparameters q, tau0, tau1, a0, b0
Examples
hyper_params <- spike_slab_params(n=100,p=200)
print(hyper_params)
spike_slab_probit
Description
Generates a Markov chain targeting the posterior of Bayesian probit regression with spike-and-slab priors.
Usage
spike_slab_probit(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL,
  tau0_inverse = NULL,
  tau1_inverse = NULL
)
Arguments
chain_length | 
 Markov chain length  | 
X | 
 design matrix of dimension n by p  | 
y | 
 Response  | 
tau0 | 
 prior hyperparameter (non-negative real)  | 
tau1 | 
 prior hyperparameter (non-negative real)  | 
q | 
 prior hyperparameter (strictly between 0 and 1)  | 
rinit | 
 initial distribution of Markov chain (default samples from the prior)  | 
verbose | 
 print iteration of the Markov chain (boolean)  | 
burnin | 
 chain burnin (non-negative integer)  | 
store | 
 store chain trajectory (boolean)  | 
Xt | 
 Pre-calculated transpose of X  | 
XXt | 
 Pre-calculated matrix X*transpose(X) (n by n matrix)  | 
tau0_inverse | 
 Pre-calculated matrix inverse(I + tau0^2*XXt) (n by n matrix)  | 
tau1_inverse | 
 Pre-calculated matrix inverse(I + tau1^2*XXt) (n by n matrix)  | 
Value
Output from Markov chain targeting the posterior corresponding to Bayesian probit regression with spike and slab priors
Examples
# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='probit')
X <- syn_data$X
Xt <- t(X)
y <- syn_data$y
# Hyperparameters
params <- spike_slab_params(n=nrow(X),p=ncol(X))
# Run S^3
sss_chain <- spike_slab_probit(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,verbose=FALSE,store=FALSE)
# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]
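The arguments Xt, XXt, tau0_inverse and tau1_inverse accept pre-calculated matrices. The base R sketch below shows one way they might be built, following the argument descriptions above. The toy design matrix and the tau0 and tau1 values are assumptions chosen for illustration, not package defaults.

```r
set.seed(1)
n <- 20
p <- 50

# Toy design matrix (assumption, for illustration only)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Illustrative hyperparameter values (assumption)
tau0 <- 0.1
tau1 <- 1

# Transpose of X (p by n) and the n by n matrix X * transpose(X)
Xt <- t(X)
XXt <- X %*% Xt

# n by n inverses of I + tau^2 * XXt, as described for tau0_inverse and tau1_inverse
tau0_inverse <- solve(diag(n) + tau0^2 * XXt)
tau1_inverse <- solve(diag(n) + tau1^2 * XXt)
```

Under these assumptions the results could be supplied through the Xt, XXt, tau0_inverse and tau1_inverse arguments of spike_slab_probit, which may avoid recomputing them when several chains share the same X.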
synthetic_data
Description
Generates synthetic linear and logistic regression data
Usage
synthetic_data(
  n,
  p,
  s0,
  error_std,
  type = "linear",
  scale = TRUE,
  signal = "constant"
)
Arguments
n | 
 number of observations  | 
p | 
 number of covariates  | 
s0 | 
 sparsity (number of non-zero components of the true signal)  | 
error_std | 
 Standard deviation of the Gaussian noise (linear regression only)  | 
type | 
 dataset type ('linear', 'logistic' or 'probit')  | 
scale | 
 whether the columns of the design matrix X are standardized to mean zero and standard deviation 1 (TRUE or FALSE)  | 
signal | 
 non-zero components of the true signal ('constant' or 'decay')  | 
Value
Design matrix, response and true signal vector for linear and logistic regression
Examples
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2)
# syn_data$X is an n by p design matrix
dim(syn_data$X)
# syn_data$y is a length n response vector
length(syn_data$y) 
# syn_data$true_beta is a length p signal vector with only the first s0 entries non-zero
all(syn_data$true_beta[1:5]!=0)
all(syn_data$true_beta[-c(1:5)]==0)
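To make the structure of such a dataset concrete, the base R sketch below builds a sparse linear regression dataset by hand. It is not the package's implementation: the i.i.d. Gaussian design, the constant signal value of 2, and the variable names are all assumptions made for illustration.

```r
set.seed(1)
n <- 100
p <- 200
s0 <- 5
error_std <- 2

# Assumption: i.i.d. standard normal design, then columns standardized
# to mean zero and standard deviation 1
X <- scale(matrix(rnorm(n * p), nrow = n, ncol = p))

# Assumption: constant signal of size 2 on the first s0 coordinates, zero elsewhere
true_beta <- c(rep(2, s0), rep(0, p - s0))

# Linear response with Gaussian noise of standard deviation error_std
y <- as.vector(X %*% true_beta + error_std * rnorm(n))

dim(X)
length(y)
```

The same checks as in the example above apply: the first s0 entries of true_beta are non-zero and the remaining p - s0 entries are zero.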