Type: | Package |
Title: | Efficient Implementation of Gaussian Process in Bayesian Hierarchical Models |
Version: | 0.1.3 |
Description: | Implements Bayesian hierarchical models with flexible Gaussian process priors, focusing on Extended Latent Gaussian Models and incorporating various Gaussian process priors for Bayesian smoothing. Computations leverage finite element approximations and adaptive quadrature for efficient inference. Methods are detailed in Zhang, Stringer, Brown, and Stafford (2023) <doi:10.1177/09622802221134172>; Zhang, Stringer, Brown, and Stafford (2024) <doi:10.1080/10618600.2023.2289532>; Zhang, Brown, and Stafford (2023) <doi:10.48550/arXiv.2305.09914>; and Stringer, Brown, and Stafford (2021) <doi:10.1111/biom.13329>. |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
LinkingTo: | TMB (≥ 1.9.7), RcppEigen |
Imports: | TMB (≥ 1.9.7), numDeriv, rstan, sfsmisc, Matrix (≥ 1.6.3), aghq (≥ 0.4.1), fda, tmbstan, LaplacesDemon, methods |
Depends: | R (≥ 3.6.0) |
Suggests: | rmarkdown, knitr, survival, testthat (≥ 3.0.0) |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2024-11-10 22:31:24 UTC; ziangzhang |
Config/testthat/edition: | 3 |
Author: | Ziang Zhang [aut, cre], Yongwei Lin [aut], Alex Stringer [aut], Patrick Brown [aut] |
Maintainer: | Ziang Zhang <ziangzhang@uchicago.edu> |
Repository: | CRAN |
Date/Publication: | 2024-11-12 14:20:03 UTC |
The monthly all-cause mortality for male with age less than 40 in Pennsylvania.
Description
The monthly all-cause mortality for male with age less than 40 in Pennsylvania.
Usage
PEN_death
Format
'PEN_death' A data frame with 299 rows of observations.
A simulated dataset from the case-crossover model.
Description
A simulated dataset from the case-crossover model.
Usage
ccData
Format
'ccData' A data frame with 3596 rows and 6 columns.
Compute the SD correction factor for sgp
Description
Compute the SD correction factor for sgp
Usage
compute_d_step_sgpsd(d, a)
Arguments
d |
A numeric value for the prediction step. |
a |
The frequency parameter of the sgp. |
Value
The correction factor c that should be used to compute the d-step PSD as c*SD.
Computing the posterior samples of the function or its derivative using the posterior samples of the basis coefficients for iwp
Description
Computing the posterior samples of the function or its derivative using the posterior samples of the basis coefficients for iwp
Usage
compute_post_fun_iwp(
samps,
global_samps = NULL,
knots,
refined_x,
p,
degree = 0,
intercept_samps = NULL
)
Arguments
samps |
A matrix that consists of posterior samples for the O-spline basis coefficients. Each column represents a particular sample of coefficients, and each row is associated with one basis function. This can be extracted using 'sample_marginal' function from 'aghq' package. |
global_samps |
A matrix that consists of posterior samples for the global basis coefficients. If NULL, assume there will be no global polynomials and the boundary conditions are exactly zero. |
knots |
A vector of knots used to construct the O-spline basis, first knot should be viewed as "0", the reference starting location. These k knots will define (k-1) basis function in total. |
refined_x |
A vector of locations to evaluate the O-spline basis |
p |
An integer value indicates the order of smoothness |
degree |
The order of the derivative to take, if zero, implies to consider the function itself. |
intercept_samps |
A matrix that consists of posterior samples for the intercept parameter. If NULL, assume the function evaluated at zero is zero. |
Value
A data.frame that contains different samples of the function or its derivative, with the first column being the locations of evaluations x = refined_x.
Examples
knots <- c(0, 0.2, 0.4, 0.6)
samps <- matrix(rnorm(n = (3 * 10)), ncol = 10)
result <- compute_post_fun_iwp(samps = samps, knots = knots, refined_x = seq(0, 1, by = 0.1), p = 2)
plot(result[, 2] ~ result$x, type = "l", ylim = c(-0.3, 0.3))
for (i in 1:9) {
lines(result[, (i + 1)] ~ result$x, lty = "dashed", ylim = c(-0.1, 0.1))
}
global_samps <- matrix(rnorm(n = (2 * 10), sd = 0.1), ncol = 10)
Computing the posterior samples of the function using the posterior samples of the basis coefficients for sGP
Description
Computing the posterior samples of the function using the posterior samples of the basis coefficients for sGP
Usage
compute_post_fun_sgp(
samps,
global_samps = NULL,
k,
refined_x,
a,
region,
boundary = TRUE,
m,
intercept_samps = NULL,
initial_location = NULL
)
Arguments
samps |
A matrix that consists of posterior samples for the O-spline basis coefficients. Each column represents a particular sample of coefficients, and each row is associated with one basis function. This can be extracted using 'sample_marginal' function from 'aghq' package. |
global_samps |
A matrix that consists of posterior samples for the global basis coefficients. If NULL, assume there will be no global polynomials and the boundary conditions are exactly zero. |
k |
The number of the sB basis. |
refined_x |
A vector of locations to evaluate the sB basis |
a |
The frequency of sGP. |
region |
The region to define the sB basis |
boundary |
A boolean variable to indicate whether the boundary condition should be considered in the prediction. |
m |
The number of harmonics to consider |
intercept_samps |
A matrix that consists of posterior samples for the intercept parameter. If NULL, assume there is no intercept samples to adjust. |
initial_location |
The initial location of the sGP. |
Value
A data.frame that contains different samples of the function, with the first column being the locations of evaluations x = refined_x.
Constructing the precision matrix given the knot sequence
Description
Constructing the precision matrix given the knot sequence
Usage
compute_weights_precision(object)
Arguments
object |
A fitted model object. |
Value
A precision matrix of the corresponding basis function, should be diagonal matrix with size (k-1) by (k-1).
Constructing the precision matrix given the knot sequence (helper)
Description
Constructing the precision matrix given the knot sequence (helper)
Usage
compute_weights_precision_helper(x)
Arguments
x |
A vector of knots used to construct the O-spline basis, first knot should be viewed as "0", the reference starting location. These k knots will define (k-1) basis function in total. |
Value
A precision matrix of the corresponding basis function, should be diagonal matrix with size (k-1) by (k-1).
Examples
compute_weights_precision_helper(x = c(0,0.2,0.4,0.6,0.8))
The COVID-19 daily death data in Canada.
Description
A subset of the the COVID-19 daily death data collected between 2020 to 2022. The data is obtained from COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (Dong et al., 2020).
Usage
covid_canada
Format
'covid_canada' A data frame with 787 rows and 5 columns:
- Date
The date of the measurement.
- new_deaths
The number of new deaths at that date.
- t
The converted numerical value of 'Date'.
- weekdays 1-6
Coded as 1 if the date is the corresponding weekday, -1 else if the date is on Sunday, and 0 otherwise.
- index
The index of that observation.
Source
Dong, E., H. Du, and L. Gardner (2020). An interactive web-based dashboard to track covid-19 in real time. The Lancet infectious diseases 20 (5), 533–534.
Custom Template Function
Description
This function allows for the dynamic modification of a C++ template within the BayesGP package. Users can specify custom content for the log-likelihood as well as the log-prior of the SD parameter in the template.
Usage
custom_template(
SETUP = NULL,
LOG_LIKELIHOOD = NULL,
LOG_PRIOR = NULL,
compile_template = FALSE
)
Arguments
SETUP |
A character string or vector containing the lines of C++ code to be inserted before the computation of log-likelihood. |
LOG_LIKELIHOOD |
A character string or vector containing the lines of C++ code to be inserted in the log-likelihood section of the template. Should be NULL if no changes are to be made to this section. |
LOG_PRIOR |
A character string or vector containing the lines of C++ code to be inserted in the log-prior (of the SD parameter) section of the template. Should be NULL if no changes are to be made to this section. |
compile_template |
A indicator of whether the new template should be compiled. default is FALSE. |
Value
A string representing the path to the temporary .so (or .dll) file containing the compiled custom C++ code.
Roxygen commands
Description
Roxygen commands
Usage
dummy()
Construct posterior inference given samples
Description
Construct posterior inference given samples
Usage
extract_mean_interval_given_samps(samps, level = 0.95, quantiles = NULL)
Arguments
samps |
Posterior samples of f or its derivative, with the first column being evaluation points x. This can be yielded by 'compute_post_fun_iwp' function. |
level |
The level to compute the pointwise interval. Ignored when quantiles are provided. |
quantiles |
A numeric vector of quantiles to be computed. |
Value
A dataframe with a column for evaluation locations x, and posterior mean and pointwise intervals at that set of locations.
Function defined to enhance the usability for users on IDEs.
Description
Function defined to enhance the usability for users on IDEs.
Usage
f(
smoothing_var,
model = "iid",
sd.prior = NULL,
boundary.prior = NULL,
initial_location = c("middle", "left", "right")
)
Arguments
smoothing_var |
The variable name of the smoothing variable. |
model |
The name of the smoothing model. |
sd.prior |
A list/vector that specifies the prior of the sd parameter. |
boundary.prior |
A list/vector that specifies the prior of the boundary parameter. |
initial_location |
A character/number that specifies the initial location of the smoothing variable. |
Value
A call object that can be used in the model_fit formula to indicate a smooth term or random effect.
Get default options for MCMC implementation
Description
This function takes an optional list of options and fills in any missing values with a set of default MCMC options.
Usage
get_default_option_list_MCMC(option_list = list())
Arguments
option_list |
A list of options to be passed to the MCMC. If some options are missing, the function will use default values. |
Value
A list containing the complete set of options with defaults where necessary.
Examples
# Example: Get the default option list
options <- get_default_option_list_MCMC()
print(options)
Constructing and evaluating the global polynomials, to account for boundary conditions (design matrix)
Description
Constructing and evaluating the global polynomials, to account for boundary conditions (design matrix)
Usage
global_poly_helper(x, p = 2)
Arguments
x |
A vector of locations to evaluate the global polynomials |
p |
An integer value indicates the order of smoothness |
Value
A matrix with i,j componet being the value of jth basis function value at ith element of x, the ncol should equal to p, and nrow should equal to the number of elements in x
Examples
global_poly_helper(x = c(0, 0.2, 0.4, 0.6, 0.8), p = 2)
Constructing and evaluating the global polynomials, to account for boundary conditions (design matrix) of sgp
Description
Constructing and evaluating the global polynomials, to account for boundary conditions (design matrix) of sgp
Usage
global_poly_helper_sgp(refined_x, a, m, initial_location = NULL)
Arguments
refined_x |
A vector of locations to evaluate the sB basis |
a |
The frequency of sgp. |
m |
The number of harmonics to consider |
initial_location |
The value of the initial location. If NULL, the minimum value of refined_x will be used. |
Value
A matrix with i,j componet being the value of jth basis function value at ith element of x, the ncol should equal to (2*m), and nrow should equal to the number of elements in x
Constructing and evaluating the local O-spline basis (design matrix)
Description
Constructing and evaluating the local O-spline basis (design matrix)
Usage
local_poly_helper(knots, refined_x, p = 2, neg_sign_order = 0)
Arguments
knots |
A vector of knots used to construct the O-spline basis, first knot should be viewed as "0", the reference starting location. These k knots will define (k-1) basis function in total. |
refined_x |
A vector of locations to evaluate the O-spline basis |
p |
An integer value indicates the order of smoothness |
neg_sign_order |
An integer value N such that D = ((-1)^N)*D for the splines at negative knots. Default is 0. |
Value
A matrix with i,j component being the value of jth basis function value at ith element of refined_x, the ncol should equal to number of knots minus 1, and nrow should equal to the number of elements in refined_x.
Examples
local_poly_helper(knots = c(0, 0.2, 0.4, 0.6, 0.8), refined_x = seq(0, 0.8, by = 0.1), p = 2)
Model fitting with random effects/fixed effects
Description
Fitting a hierarchical model based on the provided formula, data and parameters such as type of method and family of response. Returning the S4 objects for the random effects, concatenated design matrix for the intercepts and fixed effects, fitted model, indexes to partition the posterior samples.
Usage
model_fit(
formula,
data,
method = "aghq",
family = "gaussian",
control.family,
control.fixed,
aghq_k = 4,
size = NULL,
cens = NULL,
weight = NULL,
strata = NULL,
M = 3000,
customized_template = NULL,
Customized_RE = NULL,
option_list = list(),
envir = parent.frame(),
extra_theta_num = NULL
)
Arguments
formula |
A formula that contains one response variable, and covariates with either random or fixed effect. |
data |
A dataframe that contains the response variable and other covariates mentioned in the formula. |
method |
The inference method used in the model. By default, the method is set to be "aghq". |
family |
The family of response used in the model. By default, the family is set to be "gaussian". |
control.family |
Parameters used to specify the priors for the family parameters, such as the standard deviation parameter of Gaussian family. For example control.family = 1 in the Gaussian family corresponds to an Exponential prior to the standard deviation parameter of the Gaussian noise with median 1. When left unspecified, the default prior is an Exponential prior with median 1. |
control.fixed |
Parameters used to specify the priors for the fixed effects. For example control.fixed = list(intercept = list(prec = 0.001, mean = 0)) will setup the prior N(0,1/0.001) for the intercept parameter. When left unspecified, all fixed effect parameters will be assigned independent N(0,1/0.001) priors. |
aghq_k |
An integer to specify the number of quadrature points used in the aghq method. By default, the value is 4. |
size |
The name of the size variable, should be one of the variables in 'data'. The default value is "NULL", corresponding to a vector of 1s. This is only used for the Binomial family, and denotes the number of binomial trails. |
cens |
The name of the right-censoring indicator, should be one of the variables in 'data'. The default value is "NULL". This is only used for the CoxPH family. |
weight |
The name of the weight variable, should be one of the variables in 'data'. The default value is "NULL", corresponding to a vector of 1s. This is only used for the Case-Crossover family, and denotes the weight of each observation. |
strata |
The name of the strata variable, should be one of the variables in 'data'. The default value is "NULL". This is only used for the Case-Crossover family, and denotes the strata of each observation. |
M |
The number of posterior samples to be taken, by default is 3000. |
customized_template |
The name of the customized cpp template that the user wants to use instead. By default this is NULL, and the cpp template 'BayesGP' will be used. |
Customized_RE |
The list that contains the compute_B and compute_P functions for the customized random effect. By default, this is NULL and there is not customized random effect in the model. |
option_list |
A list that controls the details of the inference algorithm, by default is an empty list. |
envir |
The environment in which the formula and other expressions are to be evaluated. Defaults to 'parent.frame()', which refers to the environment from which the function was called. This allows the function to access variables that are defined in the calling function's scope. |
extra_theta_num |
An integer number to indicate the extra number of parameters required in the 'theta' vector. Should only be specified when using customized template. |
Value
A list that contains following items: the S4 objects for the random effects (instances), concatenated design matrix for the fixed effects (design_mat_fixed), fitted aghq (mod) and indexes to partition the posterior samples (boundary_samp_indexes, random_samp_indexes and fixed_samp_indexes).
Repeated fitting Bayesian Hierarchical Models for a sequence of values of the looping variable.
Description
Performs repeated model fitting over a sequence of values for a specified variable within a hierarchical model. This function repeatedly fits a model for each value of the looping variable, compiles the log marginal likelihoods, and calculates the posterior probabilities for the variable's values.
Usage
model_fit_loop(
loop_holder = "LOOP",
loop_values,
prior_func = function(x) {
1
},
parallel = FALSE,
cores = (parallel::detectCores() - 1),
...
)
Arguments
loop_holder |
A string specifying the name of the variable to loop over. The default value is 'LOOP'. |
loop_values |
A numeric vector containing the values to loop over for the specified variable. |
prior_func |
A function that takes the specified loop_values and returns the values of the prior for the loop variable. By default, it is a uniform prior which returns a constant value, indicating equal probability for all values. |
parallel |
Logical, indicating whether or not to run the model fitting in parallel (default is FALSE). |
cores |
The number of cores to use for parallel execution (default is detected cores - 1). |
... |
Additional arguments passed to the model fitting function 'model_fit'. |
Value
A data frame containing the values of the looping variable, their corresponding log marginal likelihoods, and posterior probabilities.
Obtain the posterior and prior density of all the parameters in the fitted model
Description
Obtain the posterior and prior density of all the parameters in the fitted model
Usage
para_density(object)
Arguments
object |
The fitted object from the function 'model_fit'. |
Value
A list of data frames, each data frame contains the posterior density of the corresponding parameter.
Obtain the posterior summary table for all the parameters in the fitted model
Description
Obtain the posterior summary table for all the parameters in the fitted model
Usage
post_table(object, quantiles = c(0.025, 0.975), digits = 3)
Arguments
object |
The fitted object from the function 'model_fit'. |
quantiles |
The specified quantile to display the posterior summary, default is c(0.025, 0.975). |
digits |
The significant digits to be kept in the result, default is 3. |
Value
A data frame that contains the posterior summary of all the parameters in the fitted model.
To predict the GP component in the fitted model, at the locations specified in 'newdata'.
Description
To predict the GP component in the fitted model, at the locations specified in 'newdata'.
Usage
## S3 method for class 'FitResult'
predict(
object,
newdata = NULL,
variable,
deriv = 0,
include.intercept = TRUE,
only.samples = FALSE,
quantiles = c(0.025, 0.5, 0.975),
boundary.condition = "Yes",
...
)
Arguments
object |
The fitted object from the function 'model_fit'. |
newdata |
The dataset that contains the locations to be predicted for the specified GP. Its column names must include 'variable'. |
variable |
The name of the variable to be predicted, should be in the 'newdata'. |
deriv |
The degree of derivative that the user specifies for inference. Only applicable for a GP in the 'iwp' type. |
include.intercept |
A logical variable specifying whether the intercept should be accounted when doing the prediction. The default is TRUE. For Coxph model, this variable will be forced to FALSE. |
only.samples |
A logical variable indicating whether only the posterior samples are required. The default is FALSE, and the summary of posterior samples will be reported. |
quantiles |
A numeric vector of quantiles that predict.FitResult will produce, the default is c(0.025, 0.5, 0.975). |
boundary.condition |
A string specifies whether the boundary.condition should be considered in the prediction, should be one of c("yes", "no", "only"). The default option is "Yes". |
... |
Other arguments to be passed to the function. |
Value
A data.frame that contains the posterior mean and pointwise intervals (or posterior samples) at the locations specified in 'newdata'.
Construct prior based on d-step prediction SD (for iwp)
Description
Construct prior based on d-step prediction SD (for iwp)
Usage
prior_conversion_iwp(d, prior, p)
Arguments
d |
A numeric value for the prediction step. |
prior |
A list that contains alpha and u. This specifies the target prior on the d-step SD |
p |
An integer for the order of iwp. |
Value
A list that contains alpha and u. The prior for the smoothness parameter \sigma
such that P(\sigma > u) = alpha
, that yields the ideal prior on the d-step SD.
Construct prior based on d-step prediction SD (for sgp)
Description
Construct prior based on d-step prediction SD (for sgp)
Usage
prior_conversion_sgp(d, prior, freq, period, a, m = 1)
Arguments
d |
A numeric value for the prediction step. |
prior |
A list that contains alpha and u. This specifies the target prior on the d-step SD |
freq |
The frequency of the sgp, ignored if a is provided. |
period |
The period of the sgp, ignored if a or freq is provided. |
a |
The frequency parameter of the sgp. |
m |
The number of harmonics that should be considered, by default m = 1 represents only the sgp. |
Value
A list that contains alpha and u. The prior for the smoothness parameter \sigma
such that P(\sigma > u) = alpha
, that yields the ideal prior on the d-step SD.
Extract the posterior samples from the fitted model for the target fixed variables.
Description
Extract the posterior samples from the fitted model for the target fixed variables.
Usage
sample_fixed_effect(model_fit, variables)
Arguments
model_fit |
The result from model_fit(). |
variables |
A vector of names of the target fixed variables to sample. |
Value
A matrix with columns being the posterior samples of the target fixed effect variables.
Obtain the posterior density of a SD parameter in the fitted model
Description
Obtain the posterior density of a SD parameter in the fitted model
Usage
sd_density(
object,
component = NULL,
h = NULL,
theta_logprior = NULL,
MCMC_samps_only = FALSE
)
Arguments
object |
The fitted object from the function 'model_fit'. |
component |
The component of the SD parameter that you want to show. By default this is 'NULL', indicating the family.sd is of interest. |
h |
For PSD, the unit of predictive step to consider, by default is set to 'NULL', indicating the result is using the same 'h' as in the model fitting. |
theta_logprior |
The log prior function used on the selected SD parameter. By default is 'NULL', and the current Exponential prior will be used. |
MCMC_samps_only |
For model fitted with MCMC, whether only the posterior samples are needed. |
Value
A data.frame that contains the posterior and prior densities of the SD parameter. The SD parameter will be converted into PSD if applicable.
Plot the posterior density of a SD parameter in the fitted model
Description
Plot the posterior density of a SD parameter in the fitted model
Usage
sd_plot(object, component = NULL, h = NULL, theta_logprior = NULL)
Arguments
object |
The fitted object from the function 'model_fit'. |
component |
The component of the SD parameter that you want to show. By default this is 'NULL', indicating the family.sd is of interest. |
h |
For PSD, the unit of predictive step to consider, by default is set to 'NULL', indicating the result is using the same 'h' as in the model fitting. |
theta_logprior |
The log prior function used on the selected SD parameter. By default is 'NULL', and the current Exponential prior will be used. |
Value
A plot that shows the posterior and prior density of the SD parameter. The SD parameter will be converted into PSD if applicable.