Title: | Regression Models for Bounded Continuous and Discrete Responses |
Version: | 1.3.1 |
Author: | Roberto Ascari [aut, cre], Agnese M. Di Brisco [aut], Sonia Migliorati [aut], Andrea Ongaro [aut] |
Maintainer: | Roberto Ascari <roberto.ascari@unimib.it> |
Description: | Functions to fit regression models for bounded continuous and discrete responses. In case of bounded continuous responses (e.g., proportions and rates), available models are the flexible beta (Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018) <doi:10.1214/17-BA1079>), the variance-inflated beta (Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020) <doi:10.1177/1471082X18821213>), the beta (Ferrari, S.L.P., Cribari-Neto, F. (2004) <doi:10.1080/0266476042000214501>), and their augmented versions to handle the presence of zero/one values (Di Brisco, A. M., Migliorati, S. (2020) <doi:10.1002/sim.8406>) are implemented. In case of bounded discrete responses (e.g., bounded counts, such as the number of successes in n trials), available models are the flexible beta-binomial (Ascari, R., Migliorati, S. (2021) <doi:10.1002/sim.9005>), the beta-binomial, and the binomial are implemented. Inference is dealt with a Bayesian approach based on the Hamiltonian Monte Carlo (HMC) algorithm (Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2014) <doi:10.1201/b16018>). Besides, functions to compute residuals, posterior predictives, goodness of fit measures, convergence diagnostics, and graphical representations are provided. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
LazyData: | true |
Biarch: | true |
Depends: | R (≥ 3.5.0) |
Imports: | methods, Rcpp (≥ 0.12.0), rstan (≥ 2.26.0), rstantools (≥ 2.0.0), loo, bayesplot, ggplot2, Formula, utils, grDevices, RcppParallel |
LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0), rstan (≥ 2.26.0), StanHeaders (≥ 2.26.0), RcppParallel |
SystemRequirements: | GNU make |
RoxygenNote: | 7.3.2 |
Suggests: | testthat |
NeedsCompilation: | yes |
Packaged: | 2025-04-11 18:11:58 UTC; ascari |
Repository: | CRAN |
Date/Publication: | 2025-04-14 10:40:05 UTC |
The ‘FlexReg’ package.
Description
The FlexReg package provides functions and methods to implement several types of regression models for bounded continuous responses (e.g., proportions and rates) and bounded discrete responses (e.g., number of successes in n trials). Inferential statistical analysis is dealt with by a Bayesian estimation procedure based on the Hamiltonian Monte Carlo (HMC) algorithm through the rstan package.
Author(s)
Maintainer: Roberto Ascari roberto.ascari@unimib.it
Authors:
Agnese M. Di Brisco
Sonia Migliorati
Andrea Ongaro
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. https://mc-stan.org
Atomic bombs data
Description
Count/Percentage of chromosome aberrations in atomic bombs survivors.
Format
A data.frame
containing 1039 observations on the following 5 variables:
y.perc
the percentage of cells with chromosomal abnormalities.
y
the number of cells with chromosomal abnormalities.
n
the number of analyzed cells. It is fixed to 100 for all the survivors.
dose
a quantitative measure of the radiation exposure level, expressed in rads.
bomb
a factor, indicating which bomb the subject survived (H = Hiroshima, N = Nagasaki).
Details
The data have been originally analyzed by Otake and Prentice (1984) and successively by Ascari and Migliorati (2021).
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Otake, M., Prentice, R.L. (1984). The analysis of chromosomally aberrant cells based on beta-binomial distribution. Radiat Res. 98, 456–470.
Bacteria data
Description
Count/Percentage of eggs parasitized by female parasitoids.
Format
A data.frame
containing 70 observations on the following 5 variables:
y.perc
the percentage of parasitized eggs.
y
the number of parasitized eggs.
n
the maximum number of eggs that female parasitoids could parasitized. It is fixed to 128 for all the observations.
females
the number of female parasitoids.
females_std
the standardized version of
females
.
Details
The data have been originally analyzed by Demétrio et al (2014) and successively by Ascari and Migliorati (2021). Data come from a completely randomized experiment with 10 replicates for each specification of number of females.
Source
Demétrio et al., (2014). Models for overdispersed data in entomology.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Demétrio, C.G.B., Hinde, J., Moral, R.A. (2014). Models for overdispersed data in entomology. Ecological Modelling Applied to Entomology. Entomology in Focus Switzerland: Springer International Publishing, 219–259.
Italian Households Consumption data
Description
This dataset is a subset from the 2016 Survey on Household Income and Wealth data, a statistical survey conducted by the Bank of Italy. The statistical units are the households and the head of the household is conventionally selected as the major income earner.
Format
A data.frame
containing 568 observations on the following 8 variables:
NComp
the number of household members.
Sex
the sex of the head of the household.
Age
the age of the head of the household.
NEarners
the number of household income earners.
Area
a factor indicating the geographical area where the household is located.
Citizenship
a factor indicating the citizenship of the head of household.
Income
the net disposable income.
Consumption
the propensity to consume, defined as the percentage of
Income
that is spent rather than saved.
Details
Full data are available on the website of the Bank of Italy. Consumption
is computed as the ratio between the amount of ‘consumption’ over the ‘net disposable income’.
Source
Bank of Italy, Survey on Household Income and Wealth, 2016.
Survey description.
Italian Election Results
Description
Data from the Italian general election held on 4 March 2018.
Format
A data.frame
containing 232 observations on the following 13 variables:
NVotes
the number of valid votes.
FI
the percentage of votes got by ‘Forza Italia’ party.
FDI
the percentage of votes got by ‘Fratelli d’Italia' party.
LEGA
the percentage of votes got by ‘Lega’ party.
LEU
the percentage of votes got by ‘Liberi e Uguali’ party.
M5S
the percentage of votes got by ‘Movimento 5 Stelle’ party.
PD
the percentage of votes got by ‘Partito Democratico’ party.
Other
the percentage of votes got by other parties, including blank ballots.
AgeInd
the age index, defined as the ratio of the number of elderly persons (aged 65 and over) to the number of young persons (from 0 to 14), divided by 10.
PopDens
the number of inhabitants per square km.
ER
the employment rate, defined as the ratio of the number of employed persons (aged 15-64) to the number of persons (aged 15-64).
Illiteracy
the illiteracy rate, defined as the ratio of the number of persons without a qualification (aged 15 and over) to the total number of persons aged 15 and over.
Foreign
the number of foreigners per 1000 inhabitants.
Details
Data are collected on the 232 electoral districts into which the Italian territory is organized. Distribution of votes for Aosta constituency is not available. Distributions of votes are available on the Italian Ministry of Interior’s webpage, whereas constituencies information have been obtained from 2011 Italian Census. The count of votes got by each party can be derived by multiplying the percentage of votes and the number of valid votes.
Source
Italian Ministry of Interior’s webpage: https://www.interno.gov.it/it/speciali/2018-elections.
Bayesian R-squared for flexreg
Objects
Description
Bayesian version of R-squared for flexible regression models for bounded continuous and discrete responses.
Usage
R2_bayes(model)
Arguments
model |
an object (or a list of objects) of class |
Details
The function provides a Bayesian version of the R-squared measure, defined as the variance of the predicted values divided by itself plus the expected variance of the errors.
Value
A list with the same length as the number of objects of class `flexreg`
passed in the model
argument.
Each element of the list contains a vector of Bayesian R-squared values with the same length as the Markov Chain(s) after warmup.
References
Gelman, A., Goodrich, B., Gabry, J., Vehtari, A. (2019). R-squared for Bayesian Regression Models, The American Statistician, 73:3, 307–309. doi: 10.1080/00031305.2018.1549100
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB", n.iter=1000)
hist(R2_bayes(FB)[[1]])
## End(Not run)
Reading Skills data
Description
Data for assessing the contribution of non-verbal IQ to children's reading skills in dyslexic and non-dyslexic children.
Format
A data.frame
containing 44 observations on 4 variables.
accuracy
a reading score.
accuracy.adj
the adjusted reading score: the observed 1's (perfect reading scores) are substituted with 0.99.
dyslexia
a factor indicating wheter the child is dyslexic.
iq
a standardized quantitative measure of the children's non verbal abilities.
Details
The data were originally analyzed by Pammer and Kevan (2004) and successively used by Smithson and Verkuilen (2006) and by Migliorati et al. (2018).
Source
References
Cribari-Neto, F., Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Smithson, M., Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods, 11(7), 54–71.
Stress and anxiety data
Description
Data for assessing the dependency between stress and anxiety in nonclinical women in Townsville, Queensland, Australia.
Format
A data.frame
containing 166 observations on the following 2 variables:
stress
defined as rate.
anxiety
defined as rate.
Details
Both variables are rates obtained as linear transformations from the Depression Anxiety Stress Scales which range from 0 to 42 (Lovibond & Lovibond, 1995). Additional details can be found in Example 2 from Smithson and Verkuilen (2006).
Source
Example 2 from Smithson and Verkuilen (2006).
References
Lovibond, P. F., Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behaviour research and therapy, 33(3), 335–343.
Smithson, M., Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods, 11(7), 54–71.
WAIC and LOO
Description
The function computes widely applicable information criterion (WAIC) and efficient approximate leave-one-out cross-validation (LOO) from fitted regression model objects of class `flexreg`
.
Usage
WAIC(model, ...)
## S3 method for class 'WAIC.flexreg'
print(x, ...)
Arguments
model |
an object (or a list of objects) of class |
... |
additional arguments. |
x |
an object of class |
Details
This function takes advantage of the loo package to compute the widely applicable information criterion (WAIC) and leave-one-out cross-validation (LOO) for objects of class `flexreg`
.
If a list of two or more objects of class `flexreg`
is provided, the function returns the difference in their expected predictive accuracy (see loo_compare
for further details).
Value
A named list with components from loo
and waic
.
References
Vehtari, A., Gelman, A., Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB", n.iter=1000)
WAIC(FB)
## End(Not run)
Convergence diagnostics
Description
The function returns some diagnostic measures to check for convergence to the equilibrium distribution of the Markov Chain(s). Moreover, it prints the number (and percentage) of iterations that ended with a divergence and that saturated the max treedepth, and the E-BFMI values for each chain for which E-BFMI is less than 0.2.
Usage
convergence.diag(
model,
diagnostics = "all",
pars = NULL,
additional.args = list()
)
Arguments
model |
an object of class |
diagnostics |
an optional character vector of diagnostics names. The default is to compute |
pars |
an optional character vector of parameter names. If |
additional.args |
a list containing additional arguments (see details) |
Details
-
"Rhat"
returns the potential scale reduction factor on split chains. An R-hat greater than 1 is indicative of a bad mix of the chains. At convergence R-hat has to be less than 1.05. Seerstan::Rhat
for further details. -
"geweke"
returns the z-scores, one for each parameter, for a test of equality between the means of the first 10% and last 50% of the chain. The fraction to use from the first and last part of the chain can be edited through the additional argumentsfrac1
andfrac2
. The sum offrac1
andfrac2
has to be strictly less than 1. Seecoda::geweke.diag
for further details. -
"raftery"
returns the estimate of the "dependence factor"I
. Values ofI
greater than 5 may indicate a strong autocorrelation. Additional parameters such as the quantile to be estimated (q
), the desired margin of error of the estimate (r
), and the probability (s
) of obtaining an estimate betweenq-r
andq+r
can be passed as a list in theadditional.args
argument. Seecoda::raftery.diag
for further details. -
"heidel"
returns the p-values, one for each parameter, referred to a convergence test where the null hypothesis is that the sampled values come from a stationary distribution. It is possible to set the target value for ratio of halfwidth to sample mean (eps
) and the significance level of the test (pvalue
) into theadditional.args
argument. Seecoda::heidel.diag
for further details. -
"gelman"
returns the estimate of the potential scale reduction factor and the upper confidence limit. At least two chains are needed to compute the Gelman and Rubin's convergence diagnostic. Additional parameters such as the confidence level (confidence
), a logical flag indicating whether variables should be transformed (transform
), a logical flag indicating whether only the second half of the series should be used in the computation (autoburnin
), and a logical flag indicating whether the multivariate potential scale reduction factor should be calculated for multivariate chains (multivariate
) can be passed as a list in theadditional.args
argument. Seecoda::gelman.diag
for further details.
Value
A print from check_hmc_diagnostics
function and a list of convergence diagnostics.
References
Brooks, SP., Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bayesian Statistics 4 (ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press, Oxford, UK.
Heidelberger P., Welch P.D. (1981). A spectral method for confidence interval generation and run length control in simulations. Comm. ACM. 24, 233-245.
Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7, 493-497.
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. https://mc-stan.org
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB")
convergence.diag(FB, diagnostics = c("Rhat", "geweke"), pars = "beta")
## End(Not run)
Convergence plots
Description
The function produces some convergence plots from the Monte Carlo draws.
Usage
convergence.plot(
model,
file = "convergence-output.pdf",
plotfun = "all",
pars = NULL,
point_est = "median",
prob = 0.5,
prob_outer = 0.9,
lags = 10,
warmup = F,
width = 7,
height = 7
)
Arguments
model |
an object of class |
file |
a character string giving the name of the file (including the extension .pdf) containing the convergence plots. If |
plotfun |
an optional character vector of diagnostics plots. The default is to compute |
pars |
an optional character vector of parameter names. If |
point_est |
an optional character to specify the point estimate to be shown between |
prob |
the probability mass to be included in the inner interval (for |
prob_outer |
the probability mass to be included in the outer interval of the |
lags |
the number of lags to be shown in the |
warmup |
a logical scalar indicating whether to include the warmup draws or not (default). |
width , height |
the width and height of the graphics region of each plot in inches. The default values are 7. |
Details
-
"density"
returns a density plot for each parameter inpars
computed from the posterior draws. Seebayesplot::mcmc_areas
for further details. -
"trace"
returns a trace plot for each parameter inpars
computed from the posterior draws. Seebayesplot::mcmc_trace
for further details. -
"intervals"
returns a plot of uncertainty interval for each parameter inpars
computed from the posterior draws. Seebayesplot::mcmc_intervals
for further details. -
"rate"
returns a plot for each parameter inpars
with the number of iterations on the x-axis and the Monte Carlo mean until iteration i-th on the y-axis. -
"rhat"
returns a plot with the Rhat values for each parameter inpars
. Seebayesplot::mcmc_rhat
for further details. -
"acf"
returns the autocorrelation plots (one for each parameter inpars
). Seebayesplot::mcmc_acf
for further details.
Moreover, the convergence plots can be further customized using the ggplot2 package.
Value
A .pdf file with one plot per page.
References
Brooks, SP., Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. https://mc-stan.org
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB")
convergence.plot(FB, file = "Convergence_plot_Output.pdf", pars = "beta")
## End(Not run)
Draw density plots
Description
The function draws a curve corresponding to the probability density/mass function of the specified distribution (beta, flexible beta, variance-inflated beta, binomial, beta-binomial, or flexible beta-binomial). For beta, flexible beta, and variance-inflated beta, it also allows to include the representation of the probability of augmentation in zero and/or one values.
Usage
curve.density(
type = NULL,
size = NULL,
mu = NULL,
theta = NULL,
phi = NULL,
p = NULL,
w = NULL,
k = NULL,
q0 = NULL,
q1 = NULL,
...
)
Arguments
type |
a character specifying the distribution type to be plotted ( |
size |
the total number of trials (to be specified only if |
mu |
the mean parameter of the distribution. It must lie in (0, 1). |
theta |
the overdispersion parameter (to be specified only if |
phi |
the precision parameter (if |
p |
the mixing weight (to be specified only if |
w |
the normalized distance among component means of the FB and FBB distributions (to be specified only if |
k |
the extent of the variance inflation (to be specified only if |
q0 |
the probability of augmentation in zero (to be specified only if |
q1 |
the probability of augmentation in one (to be specified only if |
... |
additional arguments of |
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Ferrari, S.L.P., and Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
## Not run:
curve.density("Beta", mu=.5, phi=20)
curve.density("Beta", mu=.5, phi=20, q1 = .3)
curve.density("FB", mu=.5, phi=20, p=.4, w=.8)
curve.density("FB", mu=.5, phi=20, p=.4, w=.8, q0= .1)
curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3)
curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3, q0=.1, q1=.3)
curve.density("Bin", size=10, mu=.7)
curve.density("BetaBin", size=10, mu=.7, phi=10)
curve.density("FBB", size=10, mu=.7, phi=10, p=.2,w=.7)
## End(Not run)
Probability density function of the beta distribution
Description
The function computes the probability density function of the beta distribution with a mean-precision parameterization. It can also compute the probability density function of the augmented beta distribution by assigning positive probabilities to zero and/or one values and a (continuous) beta density to the interval (0,1).
Usage
dBeta(x, mu, phi, q0 = NULL, q1 = NULL)
Arguments
x |
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Details
The beta distribution has density
f_B(x;\mu,\phi)=\frac{\Gamma{(\phi)}}{\Gamma{(\mu\phi)}\Gamma{((1-\mu)\phi)}}x^{\mu\phi-1}(1-x)^{(1-\mu)\phi-1}
for 0<x<1
, where 0<\mu<1
identifies the mean and \phi>0
is the precision parameter.
The augmented beta distribution has density
-
q_0
, ifx=0
-
q_1
, ifx=1
-
(1-q_0-q_1)f_B(x;\mu,\phi)
, if0<x<1
where 0<q_0<1
identifies the augmentation in zero, 0<q_1<1
identifies the augmentation in one,
and q_0+q_1<1
.
Value
A vector with the same length as x
.
References
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Examples
dBeta(x = c(.5,.7,.8), mu = .3, phi = 20)
dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2)
dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1)
Probability mass function of the beta-binomial distribution
Description
The function computes the probability mass function of the beta-binomial distribution.
Usage
dBetaBin(x, size, mu, theta = NULL, phi = NULL)
Arguments
x |
a vector of quantiles. |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
Details
The beta-binomial distribution has probability mass function
f_{BB}(x;\mu,\phi)={n\choose x} \frac{\Gamma{(\phi)}}{\Gamma{(\mu\phi)}\Gamma{((1-\mu)\phi)}} \frac{\Gamma{(\mu\phi+x)}\Gamma{((1-\mu)\phi + n - x)}}{\Gamma{(\phi + n)}},
for x \in \lbrace 0, 1, \dots, n \rbrace
, where 0<\mu<1
identifies the mean and \phi=(1-\theta)/\theta >0
is the precision parameter.
Value
A vector with the same length as x
.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Examples
dBetaBin(x = 5, size = 10, mu = .3, phi = 10)
Probability density function of the flexible beta distribution
Description
The function computes the probability density function of the flexible beta distribution. It can also compute the probability density function of the augmented flexible beta distribution by assigning positive probabilities to zero and/or one values and a (continuous) flexible beta density to the interval (0,1).
Usage
dFB(x, mu, phi, p, w, q0 = NULL, q1 = NULL)
Arguments
x |
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among component means. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Details
The FB distribution is a special mixture of two beta distributions with probability density function
f_{FB}(x;\mu,\phi,p,w)=p f_B(x;\lambda_1,\phi)+(1-p)f_B(x;\lambda_2,\phi),
for 0<x<1
, where f_B(x;\cdot,\cdot)
is the beta density with a mean-precision parameterization.
Moreover, 0<\mu=p\lambda_1+(1-p)\lambda_2<1
is the overall mean, \phi>0
is a precision parameter,
0<p<1
is the mixing weight, 0<w<1
is the normalized distance between component means, and
\lambda_1=\mu+(1-p)w
and \lambda_2=\mu-pw
are the means of the first and second component of the mixture, respectively.
The augmented FB distribution has density
-
q_0
, ifx=0
-
q_1
, ifx=1
-
(1-q_0-q_1)f_{FB}(x;\mu,\phi,p,w)
, if0<x<1
where 0<q_0<1
identifies the augmentation in zero, 0<q_1<1
identifies the augmentation in one,
and q_0+q_1<1
.
Value
A vector with the same length as x
.
References
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5)
dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2)
dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1)
Probability mass function of the flexible beta-binomial distribution
Description
The function computes the probability mass function of the flexible beta-binomial distribution.
Usage
dFBB(x, size, mu, theta = NULL, phi = NULL, p, w)
Arguments
x |
a vector of quantiles. |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among component means. It must lie in (0, 1). |
Details
The FBB distribution is a special mixture of two beta-binomial distributions with probability mass function
f_{FBB}(x;\mu,\phi,p,w) = p BB(x;\lambda_1,\phi)+(1-p)BB(x;\lambda_2,\phi),
for x \in \lbrace 0, 1, \dots, n \rbrace
, where BB(x;\cdot,\cdot)
is the beta-binomial distribution with a mean-precision parameterization.
Moreover, \phi=(1-\theta)/\theta>0
is a precision parameter, 0<p<1
is the mixing weight, 0<\mu=p\lambda_1+(1-p)\lambda_2<1
is the overall mean,
0<w<1
is the normalized distance between component means, and
\lambda_1=\mu+(1-p)w
and \lambda_2=\mu-pw
are the scaled means of the first and second component of the mixture, respectively.
Value
A vector with the same length as x
.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Examples
dFBB(x = c(5,7,8), size=10, mu = .3, phi = 20, p = .5, w = .5)
Probability density function of the variance-inflated beta distribution
Description
The function computes the probability density function of the variance-inflated beta distribution. It can also compute the probability density function of the augmented variance-inflated beta distribution by assigning positive probabilities to zero and/or one values and a (continuous) variance-inflated beta density to the interval (0,1).
Usage
dVIB(x, mu, phi, p, k, q0 = NULL, q1 = NULL)
Arguments
x |
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
k |
the extent of the variance inflation. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Details
The VIB distribution is a special mixture of two beta distributions with probability density function
f_{VIB}(x;\mu,\phi,p,k)=p f_B(x;\mu,\phi k)+(1-p)f_B(x;\mu,\phi),
for 0<x<1
, where f_B(x;\cdot,\cdot)
is the beta density with a mean-precision parameterization.
Moreover, 0<p<1
is the mixing weight, 0<\mu<1
represents the overall (as well as mixture component)
mean, \phi>0
is a precision parameter, and 0<k<1
determines the extent of the variance inflation.
The augmented VIB distribution has density
-
q_0
, ifx=0
-
q_1
, ifx=1
-
(1-q_0-q_1)f_{VIB}(x;\mu,\phi,p,k)
, if0<x<1
where 0<q_0<1
identifies the augmentation in zero, 0<q_1<1
identifies the augmentation in one,
and q_0+q_1<1
.
Value
A vector with the same length as x
.
References
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Examples
dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5)
dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1)
dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1)
extract.pars
Description
extract.pars
Usage
extract.pars(posterior)
Flexible Regression Models for Bounded Continuous Responses
Description
The function fits some flexible regression models for bounded continuous responses (e.g., proportions and rates) via a Bayesian approach to inference based on Hamiltonian Monte Carlo algorithm.
Available regression models are the flexible beta regression model (type = "FB"
, default), the variance inflated beta (type = "VIB"
), the beta (type = "Beta"
), as well as their augmented versions.
Usage
flexreg(
formula,
zero.formula = NULL,
one.formula = NULL,
data,
type = "FB",
link.mu = "logit",
prior.beta = "normal",
hyperparam.beta = NULL,
prior.omega0 = "normal",
hyperparam.omega0 = NULL,
prior.omega1 = "normal",
hyperparam.omega1 = NULL,
link.phi = NULL,
prior.phi = NULL,
hyperparam.phi = NULL,
prior.psi = NULL,
hyperparam.psi = NULL,
n.chain = 1,
n.iter = 5000,
warmup.perc = 0.5,
thin = 1,
verbose = TRUE,
...
)
Arguments
formula |
an object of class " |
zero.formula |
an object of class " |
one.formula |
an object of class " |
data |
an optional |
type |
a character specifying the type of regression model. Current options are |
link.mu |
a character specifying the link function for the mean model (mu). Currently, |
prior.beta |
a character specifying the prior distribution for the regression coefficients of the mean model, |
hyperparam.beta |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
prior.omega0 |
a character specifying the prior distribution for the regression coefficients of the augmented model in zero, |
hyperparam.omega0 |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
prior.omega1 |
a character specifying the prior distribution for the regression coefficients of the augmented model in one, |
hyperparam.omega1 |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
link.phi |
a character specifying the link function for the precision model (phi). Currently, |
prior.phi |
a character specifying the prior distribution for precision parameter |
hyperparam.phi |
a positive numeric (vector of length 1) specifying the hyperprior parameter for the prior distribution of |
prior.psi |
a character specifying the prior distribution for the regression coefficients of the precision model |
hyperparam.psi |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
n.chain |
a positive integer specifying the number of Markov chains. The default is 1. |
n.iter |
a positive integer specifying the number of iterations for each chain (including warm-up). The default is 5000. |
warmup.perc |
the percentage of iterations per chain to discard. |
thin |
a positive integer specifying the period for saving samples. The default is 1. |
verbose |
a logical (with default |
... |
additional arguments from |
Details
Let Y be a continuous bounded random variable whose distribution can be specified in the type
argument and \mu
be the mean of Y.
The flexreg
function links the parameter \mu
to a linear predictor through a function g_1(\cdot)
specified in link.mu
:
g_1(\mu) = \bold{x}^t \bold{\beta},
where \bold{\beta}
is the vector of regression coefficients for the mean model.
The prior distribution and the related hyperparameter of \bold{\beta}
can be specified in prior.beta
and hyperparam.beta
, respectively.
By default, the precision parameter \phi
is assumed to be constant.
The prior distribution and the related hyperparameter of \phi
can be specified in prior.phi
and hyperparam.phi
.
It is possible to extend the model by linking \phi
to an additional (possibly overlapping) set of covariates through a proper link
function g_2(\cdot)
specified in the link.phi
argument:
g_2(\phi) = \bold{z}^t \bold{\psi},
where \bold{\psi}
is the vector of regression coefficients for the precision model.
The prior distribution and the related hyperparameter of \bold{\psi}
can be specified in prior.psi
and hyperparam.psi
.
In the function flexreg
, the regression model for the mean and, where appropriate, for the precision parameter can be specified in the
formula
argument with a formula of type y ~ x1 + x2 | z1 + z2
where covariates on the left of "|" are included in the regression model
for the mean, whereas covariates on the right of "|" are included in the regression model for the precision.
If the second part is omitted, i.e., y ~ x1 + x2
, the precision is assumed constant for each observation.
In presence of zero values in the response, one has to link the parameter q_0
, i.e., the probability of augmentation in zero, to an additional (possibly overlapping) set of covariates through a logit link function:
g_3(q_{0}) = \bold{x}_{0}^t \bold{\omega_0},
where \bold{\omega_0}
is the vector of regression coefficients for the augmented model in zero.
The prior distribution and the related hyperparameter of \bold{\omega_0}
can be specified in prior.omega0
and hyperparam.omega0
.
In presence of one values in the response, one has to link the parameter q_1
, i.e., the probability of augmentation in one, to an additional (possibly overlapping) set of covariates through a logit link function:
g_4(q_{1}) = \bold{x}_{1}^t \bold{\omega_1},
where \bold{\omega_1}
is the vector of regression coefficients for the augmented model in one.
The prior distribution and the related hyperparameter of \bold{\omega_1}
can be specified in prior.omega1
and hyperparam.omega1
.
If both the augmented models in zero and one are specified, the link function is a bivariate logit.
In flexreg
function, the augmented models in zero and/or one can be specified in the
zero.formula
and/or one.formula
arguments with a formula of type ~ x
.
Left hand side in zero.formula
and one.formula
can be omitted; if specified, they have to be the same as left hand side in formula
.
Value
The flexreg
function returns an object of class `flexreg`
, i.e. a list with the following elements:
call |
the function call. |
type |
the type of regression model. |
formula |
the overall formula. |
aug |
a character specifing the absence of the augmentation ( |
link.mu |
a character specifing the link function in the mean model. |
link.phi |
a character specifing the link function in the precision model. |
model |
an object of class |
response |
the response variable, assuming values in (0, 1). |
design.X |
the design matrix for the mean model. |
design.Z |
the design matrix for the precision model (if defined). |
design.X0 |
the design matrix for the augmented model in zero (if defined). |
design.X1 |
the design matrix for the augmented model in one (if defined). |
References
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018) A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB")
# Regression model with one augmentation:
AFB1 <- flexreg(accuracy ~ dyslexia | iq + dyslexia + iq:dyslexia,
one.formula = ~ iq + dyslexia, data = Reading, type="FB")
## End(Not run)
Flexible Regression Models for Bounded Discrete Responses
Description
The function fits some flexible regression models for bounded discrete responses via a Bayesian approach to inference based on Hamiltonian Monte Carlo algorithm.
Available regression models are the flexible beta-binomial (type = "FBB"
, default), the beta-binomial (type = "BetaBin"
), and the binomial one (type = "Bin"
).
Usage
flexreg_binom(
formula,
data,
type = "FBB",
n,
link.mu = "logit",
prior.beta = "normal",
hyperparam.beta = 100,
hyper.theta.a = NULL,
hyper.theta.b = NULL,
link.theta = NULL,
prior.psi = NULL,
hyperparam.psi = NULL,
n.chain = 1,
n.iter = 5000,
warmup.perc = 0.5,
thin = 1,
verbose = TRUE,
...
)
Arguments
formula |
an object of class " |
data |
an optional |
type |
a character specifying the type of regression model. Current options are |
n |
a character specifying the name of the variable containing the total number of trials. |
link.mu |
a character specifying the link function for the mean model. Currently, |
prior.beta |
a character specifying the prior distribution for the regression coefficients of the mean model, |
hyperparam.beta |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
hyper.theta.a |
a numeric (vector of length 1) specifying the first shape parameter for the beta prior distribution of |
hyper.theta.b |
a numeric (vector of length 1) specifying the second shape parameter for the beta prior distribution of |
link.theta |
a character specifying the link function for the overdispersion model. Currently, |
prior.psi |
a character specifying the prior distribution for the regression coefficients of the overdispersion model, |
hyperparam.psi |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
n.chain |
a positive integer specifying the number of Markov chains. The default is 1. |
n.iter |
a positive integer specifying the number of iterations for each chain (including warm-up). The default is 5000. |
warmup.perc |
the percentage of iterations per chain to discard. |
thin |
a positive integer specifying the period for saving samples. The default is 1. |
verbose |
a logical (with default |
... |
additional arguments from |
Details
Let Y be a random variable whose distribution can be specified in the type
argument and \mu
be the mean of Y/n.
The flexreg_binom
function links the parameter \mu
to a linear predictor through a function g_1(\cdot)
specified in link.mu
:
g_1(\mu) = x^t \bold{\beta},
where \bold{\beta}
is the vector of regression coefficients for the mean model.
The prior distribution and the related hyperparameter of \bold{\beta}
can be specified in prior.beta
and hyperparam.beta
.
By default, link.theta="identity"
, meaning that the overdispersion parameter \theta
is assumed to be constant.
In that case, the prior distribution for \theta
is a beta with shape hyperparameters a
and b
that can be specified in hyper.theta.a
and hyper.theta.b
.
If not specified, a=b=1
, otherwise if only one hyperparameter is specified, the other is set equal.
It is possible to extend the model by linking \theta
to an additional (possibly overlapping) set of covariates through a proper link
function g_2(\cdot)
specified in the link.theta
argument:
g_2(\theta) = z^t \bold{\psi},
where \bold{\psi}
is the vector of regression coefficients for the overdispersion model.
The prior distribution and the related hyperparameter of \bold{\psi}
can be specified in prior.psi
and hyperparam.psi
.
In flexreg_binom
, the regression model for the mean and, where appropriate, for the overdispersion parameter can be specified in the
formula
argument with a formula of type y ~ x1 + x2 | z1 + z2
where covariates on the left of "|" are included in the regression model
for the mean, whereas covariates on the right of "|" are included in the regression model for the overdispersion.
If the second part is omitted, i.e., y ~ x1 + x2
, the overdispersion is assumed constant for each observation.
Value
The flexreg_binom
function returns an object of class `flexreg`
, i.e. a list with the following elements:
call |
the function call. |
type |
the type of regression model. |
formula |
the original formula. |
link.mu |
a character specifing the link function in the mean model. |
link.theta |
a character specifing the link function in the overdispersion model. |
model |
an object of class |
response |
the response variable, assuming values in (0, 1). |
design.X |
the design matrix for the mean model. |
design.Z |
the design matrix for the overdispersion model (if defined). |
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Examples
## Not run:
data(Bacteria)
fbb <- flexreg_binom(y ~ females, n = "n", data = Bacteria, type = "FBB")
## End(Not run)
mu.chain.nd
Description
mu.chain.nd
Usage
mu.chain.nd(posterior, newdata.X, link.mu)
newdata.adjust
Description
newdata.adjust
Usage
newdata.adjust(newdata, formula)
phi.chain.nd
Description
phi.chain.nd
Usage
phi.chain.nd(posterior, newdata.Z, link.phi)
Plot Method for flexreg
Objects
Description
Method for plotting regression curves for the mean from fitted regression model objects of class `flexreg`
.
Usage
## S3 method for class 'flexreg'
plot(
x,
name.x,
additional.cov.default = NA,
smooth = TRUE,
cluster = FALSE,
type = "response",
...
)
Arguments
x |
an object of class |
name.x |
a character containing the name of the covariate from the mean model to be plotted on the x-axis of the scatterplot. |
additional.cov.default |
a list of additional covariates from the mean model and their value to be set as default. |
smooth |
a logical value indicating wheater the curves should be smooth ( |
cluster |
logical. If the model is |
type |
a vector of characters indicating the regression curves to be plotted. Available options are |
... |
additional arguments. Currently not used. |
Details
The function produces a scatterplot of the covariate from the mean model specified in name.x
and y
or y/n
if the response is bounded continuous or discrete, respectively. Any other variable specified in the mean model must be set to a default through the additional.cov.default
argument.
The argument type = "response"
plots the conditional mean curve (i.e., \mu
), whereas the argument type = "response.aug"
, available only for augmented models,
plots the augmented mean curve.
If the regression model is of "FB"
or "FBB"
type and cluster = TRUE
, then the function returns two additional curves corresponding to the component means, i.e., \lambda_1
and \lambda_2
.
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq + dyslexia, data = Reading)
plot(FB, name.x="iq", additional.cov.default = list("dyslexia"=1))
## End(Not run)
Plot Method for `flexreg_postpred`
objects
Description
Method for an object of class `flexreg_postpred`
containing the simulated posterior predictive distribution, usually the result of posterior_predict
function. The plot shows the posterior predictive interval for each statistical unit.
Additionally, the mean of the posterior predictives and the values of the observed response (either y
or y/n
for bounded continuous or discrete responses, respectively) can be added.
Usage
## S3 method for class 'flexreg_postpred'
plot(x, prob = 0.9, p_mean = F, response = NULL, ...)
Arguments
x |
an object of class |
prob |
the interval probability for the posterior predictives (default is 0.9). |
p_mean |
a logical value indicating whether the posterior predictives' mean should be plotted. |
response |
a numerical vector containing the response (either |
... |
additional arguments. Currently not used. |
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy ~ iq, data = Reading)
pp <- posterior_predict(FB)
plot(pp)
## End(Not run)
posterior_predict
Description
posterior_predict
Usage
posterior_predict(model, newdata = NULL, n.new = NULL)
Posterior Predictive Method for `flexreg`
objects
Description
The function takes an object of class `flexreg`
and generates values from the posterior predictive distribution.
Usage
## S3 method for class 'flexreg'
posterior_predict(model, newdata = NULL, n.new = NULL)
Arguments
model |
an object of class |
newdata |
an optional |
n.new |
an optional vector containing the total number of trials with which to predict. It must be specified if |
Details
The function generates values from the posterior predictive distribution, which is the distribution of a future outcome given the observed data.
The posterior predictive distribution is computed for y
in case of bounded continuous responses and
for y/n
in case of bounded discrete responses.
Value
An object of class `flexreg_postpred`
containing a matrix with the simulated posterior predictions. Each column refers to a statistical unit to predict.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2014). Bayesian Data Analysis, 3th edition. Chapman and Hall/CRC. doi:10.1201/b16018
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000)
pp <- posterior_predict(FB)
plot(pp)
## End(Not run)
Predict Method for `flexreg`
Objects
Description
Method that computes various types of predictions from objects of class `flexreg`
.
Usage
## S3 method for class 'flexreg'
predict(
object,
newdata = NULL,
n.new = NULL,
cluster = FALSE,
type = "response",
estimate = "mean",
q = NULL,
...
)
Arguments
object |
an object of class |
newdata |
an optional |
n.new |
an optional vector containing the total number of trials with which to predict. It must be specified if |
cluster |
a logical (with default |
type |
a character indicating the type of prediction. Available options are: |
estimate |
a character indicating the type of estimate. Available options are |
q |
if |
... |
additional arguments. Currently not used. |
Details
The predict
method computes various types of predictions from objects of class `flexreg`
.
If type = "response"
, the function returns the marginal mean, i.e., \mu
.
In case of models for continuous bounded responses with augmentation, the function returns also the overall mean
q_1+(1-q_0-q_1)\mu
and the probabilities of augmentation q_0
and/or q_1
.
If type = "variance"
, the function returns Var(Y|0<Y<1)
in case of no augmentation and
(1-q_0-q_1)Var(Y|0<Y<1)+q_1^2+(1-q_0-q_1)\mu^2-(q_1+(1-q_0-q_1)\mu)^2
in case of augmentation.
If cluster = TRUE
, for FB and FBB models, the function returns the cluster means (\lambda_1
and \lambda_2
) when type = "response"
and the cluster variances when type = "variance"
.
The option type = "overdispersion"
is available only for beta-binomial and flexible beta-binomial models and returns the fitted overdispersion.
Value
The function returns a data.frame
of different dimensions depending on the type of prediction.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB")
predict(FB, type="response", cluster=TRUE)
## End(Not run)
predict_lambda.chain
Description
predict_lambda.chain
Usage
predict_lambda.chain(posterior, mu.chain, newdata)
predict_link
Description
predict_link
Usage
predict_link(model, posterior, newdata)
predict_mu.chain
Description
predict_mu.chain
Usage
predict_mu.chain(model, posterior, newdata)
predict_over
Description
predict_over
Usage
predict_over(model, posterior, newdata)
predict_precision
Description
predict_precision
Usage
predict_precision(model, posterior, newdata)
predict_q.chain
Description
predict_q.chain
Usage
predict_q.chain(model, posterior, newdata = NULL)
predict_response
Description
predict_response
Usage
predict_response(model, posterior, newdata, cluster, n)
predict_variance
Description
predict_variance
Usage
predict_variance(
model,
posterior,
newdata,
cluster,
cluster.var = T,
model.type,
model.class,
n
)
Print Methods for flexreg Objects
Description
Print Methods for flexreg Objects
Usage
## S3 method for class 'flexreg'
print(x, ...)
Arguments
x |
an object of class |
... |
additional arguments. Currently not used. |
q0.chain.nd
Description
q0.chain.nd
Usage
q0.chain.nd(posterior, newdata.X0)
q01.chain.nd
Description
q01.chain.nd
Usage
q01.chain.nd(posterior, newdata.X0, newdata.X1)
q1.chain.nd
Description
q1.chain.nd
Usage
q1.chain.nd(posterior, newdata.X1)
Random generator from the beta distribution
Description
The function generates random values from the beta distribution with a mean-precision parameterization, or from the augmented beta distribution.
Usage
rBeta(n, mu, phi, q0 = NULL, q1 = NULL)
Arguments
n |
the number of values to generate. If |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Value
A vector of length n
.
References
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Examples
rBeta(n = 100, mu = .5, phi = 30)
rBeta(n = 100, mu = .5, phi = 30, q0 = .2, q1 = .1)
Random generator from the beta-binomial distribution
Description
The function generates random values from the beta-binomial distribution.
Usage
rBetaBin(n, size = NULL, mu = NULL, theta = NULL, phi = NULL)
Arguments
n |
the number of values to generate. If |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
Value
A vector of length n
.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Examples
rBetaBin(n = 100, size = 40, mu = .5, theta = .4)
rBetaBin(n = 100, size = 40, mu = .5, phi = 1.5)
Random generator from the flexible beta distribution
Description
The function generates random values from the flexible beta distribution, or from the augmented flexible beta distribution.
Usage
rFB(n, mu, phi, p, w, q0 = NULL, q1 = NULL)
Arguments
n |
the number of values to generate. If |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among clusters. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Value
A vector of length n
.
References
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6)
rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6, q0 = .2, q1 = .1)
Random generator from the flexible beta-binomial distribution
Description
The function generates random values from the flexible beta-binomial distribution.
Usage
rFBB(n, size = NULL, mu, theta = NULL, phi = NULL, p, w)
Arguments
n |
the number of values to generate. If |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among clusters. It must lie in (0, 1). |
Value
A vector of length n
.
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Examples
rFBB(n = 100, size = 40, mu = .5, theta = .4, p = .3, w = .6)
rFBB(n = 100, size = 40, mu = .5, phi = 1.5, p = .3, w = .6)
Random generation from the variance-inflated beta distribution
Description
The function generates random values from the variance-inflated beta distribution, or from the augmented variance-inflated beta distribution.
Usage
rVIB(n, mu, phi, p, k, q0 = NULL, q1 = NULL)
Arguments
n |
the number of values to generate. If |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
k |
the extent of the variance inflation. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
Value
A vector of length n
.
References
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Examples
rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6)
rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6, q0 = .2, q1 = .1)
rate_plot
Description
rate_plot
Usage
rate_plot(chains, pars, n.warmup = n.warmup)
Residuals Method for flexreg Objects
Description
Method that computes various types of residuals from objects of class `flexreg`
. If the model type is FB
or FBB
and cluster = TRUE
, the method returns also residuals with respect to cluster means.
Usage
## S3 method for class 'flexreg'
residuals(
object,
type = "raw",
cluster = FALSE,
estimate = "mean",
q = NULL,
...
)
Arguments
object |
an object of class |
type |
a character indicating type of residuals ( |
cluster |
logical. If the model is |
estimate |
a character indicating the type of estimate: |
q |
if |
... |
additional arguments. Currently not used. |
Details
The residuals
method computes raw and standardized residuals from objects of class `flexreg`
.
Raw residuals are defined as r=y-\hat{\mu}
for bounded continuous responses or as r= y/n-\hat{\mu}
for bounded discrete responses.
Values y
and y/n
are the observed
responses which are specified on the left-hand side of formula
in the
flexreg
and flexreg_binom
functions, respectively.
Moreover, \hat{\mu}
is the predicted value, the result of
the predict
function with type = "response"
.
Standardized residuals are defined as \frac{r}{\sqrt{\widehat{Var}(y)}}
where
\widehat{Var}(y)
is the variance of the response evaluated at the posterior means
–by default, otherwise evaluated at the posterior quantiles of order q
– of the parameters.
If the model is "FB"
or "FBB"
, type = "raw"
, and cluster = TRUE
, the cluster raw residuals are computed as
the difference between the observed response/relative response and the cluster means, i.e.,
\hat{\lambda}_{1}
and \hat{\lambda}_{2}
.
If the model is "FB"
or "FBB"
, type = "standardized"
and cluster = TRUE
, the cluster standardized residuals are computed as the
cluster raw residuals divided by the square root of the cluster variances.
Cluster residuals, either raw or standardized, can be used for classification purpose. Indeed, with cluster = TRUE
the residuals
method returns also a column named
"label"
assigning values 1 or 2 to observations depending on whether they are classified in cluster 1 (if the corresponding cluster residual is smaller) or in cluster 2.
Value
The method returns an array with as many rows as the number of observations in the sample. If cluster = FALSE
, the array has only one column containing either the raw or standardized residuals.
If cluster = TRUE
, the array has four columns: the first column contains the raw or standardized residuals, the second and third columns contain the cluster residuals,
and the fourth column contains the classification labels (see Details).
References
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB")
residuals(FB, type="raw", cluster=TRUE)
## End(Not run)
Methods for `flexreg`
Objects
Description
Methods for extracting information from fitted regression model objects of class `flexreg`
.
Usage
## S3 method for class 'flexreg'
summary(object, ..., digits = 4)
## S3 method for class 'summary.flexreg'
print(x, ...)
## S3 method for class 'flexreg'
coef(object, ...)
Arguments
object |
an object of class |
... |
additional arguments. Currently not used. |
digits |
an integer indicating the number of decimal places. Default equal to 4. |
x |
an object of class |
Details
The summary.flexreg
method summarizes the results of flexreg
and flexreg_binom
functions, adding also information from the functions
residuals.flexreg
and WAIC
. The summary.flexreg
method returns an object of class `summary.flexreg`
containing the relevant summary statistics which can subsequently be
printed using the associated print.summary.flexreg
method.
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000)
summary(FB)
## End(Not run)
Summary Method for `flexreg_postpred`
objects
Description
Summary method for an object of class `flexreg_postpred`
, containing the simulated posterior predictive distribution.
Usage
## S3 method for class 'flexreg_postpred'
summary(object, ...)
Arguments
object |
an object of class |
... |
additional arguments. |
Value
The function summary.flexreg_postpred
returns an array with the statistical units by row. The number of rows of the array is equal to the number of columns of the object of class `flexreg_postpred`
that is given to the function.
By column there are some synthesis values that are the minimum, the first quartile, the media, the mean, the third quartile, and the maximum.
Examples
## Not run:
data("Reading")
FB <- flexreg(accuracy ~ iq, data = Reading)
pp <- posterior_predict(FB)
summary(pp)
## End(Not run)
theta.chain.nd
Description
theta.chain.nd
Usage
theta.chain.nd(posterior, newdata.theta, link.theta)
var.fun
Description
var.fun
Usage
var.fun(
model.class,
model.type,
posterior,
mu.chain,
phi.chain,
theta.chain,
q0.chain,
q1.chain,
l1.chain,
l2.chain,
n
)