Title: | An Implementation of Z-Curves |
Version: | 2.4.3 |
Maintainer: | František Bartoš <f.bartos96@gmail.com> |
Description: | An implementation of z-curves - a method for estimating expected discovery and replicability rates on the basis of test statistics of published studies. The package provides functions for fitting the density, EM, and censored EM versions (Bartoš & Schimmack, 2022, <doi:10.15626/MP.2021.2720>; Schimmack & Bartoš, 2023, <doi:10.1371/journal.pone.0290084>), as well as the original density z-curve (Brunner & Schimmack, 2020, <doi:10.15626/MP.2018.874>). Furthermore, the package provides summarizing and plotting functions for the fitted z-curve objects. See the aforementioned articles for more information about the z-curves, expected discovery and replicability rates, validation studies, and limitations. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
Imports: | Rcpp (≥ 1.0.2), nleqslv, stats, evmix, graphics, ggplot2, Rdpack, rlang |
LinkingTo: | Rcpp |
Suggests: | parallel, spelling, testthat, vdiffr |
Language: | en-US |
RdMacros: | Rdpack |
URL: | https://fbartos.github.io/zcurve/ |
BugReports: | https://github.com/FBartos/zcurve/issues |
NeedsCompilation: | yes |
Packaged: | 2025-05-16 20:40:58 UTC; fbart |
Author: | František Bartoš [aut, cre], Ulrich Schimmack [aut] |
Repository: | CRAN |
Date/Publication: | 2025-05-16 21:10:02 UTC |
zcurve: An Implementation of Z-Curves
Description
An implementation of z-curves - a method for estimating expected discovery and replicability rates on the basis of test statistics of published studies. The package provides functions for fitting the density, EM, and censored EM versions (Bartoš & Schimmack, 2022, doi:10.15626/MP.2021.2720; Schimmack & Bartoš, 2023, doi:10.1371/journal.pone.0290084), as well as the original density z-curve (Brunner & Schimmack, 2020, doi:10.15626/MP.2018.874). Furthermore, the package provides summarizing and plotting functions for the fitted z-curve objects. See the aforementioned articles for more information about the z-curves, expected discovery and replicability rates, validation studies, and limitations.
Author(s)
Maintainer: František Bartoš f.bartos96@gmail.com
Authors:
Ulrich Schimmack ulrich.schimmack@utoronto.ca
See Also
Useful links:
https://fbartos.github.io/zcurve/
Report bugs at https://github.com/FBartos/zcurve/issues
Z-scores from a subset of original studies featured in the OSC 2015 reproducibility project
Description
The dataset contains z-scores from a subset of the original studies featured in the psychology reproducibility project (Open Science Collaboration 2015). Only z-scores from studies with unambiguous original outcomes are supplied (eliminating 7 studies with marginally significant results). The real replication rate for those studies is 35/90 (the whole project reports 36/97).
Usage
OSC.z
Format
A vector with 90 observations
References
Open Science Collaboration (2015). “Estimating the reproducibility of psychological science.” Science, 349(6251). doi:10.1126/science.aac4716.
Control settings for the zcurve EM algorithm
Description
All these settings are passed to the Expectation Maximization
fitting algorithm. All unspecified settings are set to their default values.
Setting model = "EM" sets all settings to their default
values irrespective of any other setting and fits z-curve as described in
Bartoš and Schimmack (2022).
Arguments
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
mu |
Means of the components, defaults to
|
sigma |
A standard deviation of the components, defaults to
|
theta_alpha |
A vector of alpha parameters of a Dirichlet distribution
for generating random starting values for the weights, defaults to
|
theta_max |
Upper limits for weights, defaults to
|
criterion |
A criterion to terminate the EM algorithm,
defaults to |
criterion_start |
A criterion to terminate the starting phase
of the EM algorithm, defaults to |
criterion_boot |
A criterion to terminate the bootstrapping phase
of the EM algorithm, defaults to |
max_iter |
A maximum number of iterations of the EM algorithm
(not including the starting iterations) defaults to |
max_iter_start |
A maximum number of iterations for the
starting phase of EM algorithm, defaults to |
max_iter_boot |
A maximum number of iterations for the
booting phase of EM algorithm, defaults to |
fit_reps |
A number of starting fits to get the initial
position for the EM algorithm, defaults to |
References
Bartoš F, Schimmack U (2022). “Z-curve 2.0: Estimating replication rates and discovery rates.” Meta-Psychology, 6. doi:10.15626/MP.2021.2720.
See Also
Examples
# to increase the number of starting fits
# and change the means of the mixture components
ctrl <- list(
fit_reps = 50,
mu = c(0, 1.5, 3, 4.5, 6)
)
## Not run: zcurve(OSC.z, method = "EM", control = ctrl)
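The theta_alpha setting above refers to Dirichlet-distributed starting weights. As a rough illustration of what such a draw looks like, the base R sketch below normalizes independent gamma draws, which is a standard way to sample from a Dirichlet distribution. The concentration values and the number of components are assumptions chosen for illustration, and the package's internal routine may differ.

```r
# Draw one set of random starting weights from a Dirichlet distribution
# by normalizing independent gamma draws (standard construction).
set.seed(1)

# hypothetical concentration parameters, one per mixture component
theta_alpha <- rep(0.5, 7)

g <- rgamma(length(theta_alpha), shape = theta_alpha, rate = 1)
start_weights <- g / sum(g)

start_weights       # non-negative weights, one per component
sum(start_weights)  # weights sum to one
```

Smaller concentration values tend to produce more uneven starting weights, so repeated draws explore more diverse starting positions.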
Control settings for the z-curve 2.0 density algorithm
Description
All settings are passed to the density fitting
algorithm. All unspecified settings are set to their default values.
Setting model = "KD2" sets all settings to their default
values irrespective of any other setting and fits z-curve as
described in Bartoš and Schimmack (2022). In order to fit the
z-curve 1.0 density algorithm, set model = "KD1" and go to
control_density_v1.
Arguments
version |
Which version of z-curve should be fitted. Defaults to
|
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
mu |
Means of the components, defaults to |
sigma |
A standard deviation of the components, "Don't touch this"
- Ulrich Schimmack, defaults to |
theta_min |
Lower limits for weights, defaults to
|
theta_max |
Upper limits for weights, defaults to
|
max_iter |
A maximum number of iterations for the nlminb
optimization for fitting mixture model, defaults to |
max_eval |
A maximum number of evaluations for the nlminb
optimization for fitting mixture model, defaults to |
criterion |
A criterion to terminate nlminb optimization,
defaults to |
bw |
A bandwidth of the kernel density estimation, defaults to |
aug |
Augment truncated kernel density, defaults to |
aug.bw |
A bandwidth of the augmentation, defaults to |
n.bars |
A resolution of density function, defaults to |
density_dbc |
Use bckden to estimate a truncated kernel density,
defaults to |
compute_FDR |
Whether to compute FDR, leads to noticeable increase in
computation, defaults to |
criterion_FDR |
A criterion for estimating the maximum FDR, defaults
to |
criterion_FDR_dbc |
A criterion for estimating the maximum FDR using
the bckden function, defaults to |
precision_FDR |
A maximum FDR precision, defaults to |
References
Bartoš F, Schimmack U (2022). “Z-curve 2.0: Estimating replication rates and discovery rates.” Meta-Psychology, 6. doi:10.15626/MP.2021.2720.
See Also
zcurve(), control_density_v1, control_EM
Examples
# to decrease the criterion and increase the number of iterations
ctrl <- list(
max_iter = 300,
criterion = 1e-4
)
## Not run: zcurve(OSC.z, method = "density", control = ctrl)
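The bw, a, b, and n.bars settings describe a kernel density estimate of the significant z-statistics evaluated on a grid over the fitting interval. The sketch below shows that idea with stats::density() from base R. It is not the package's estimator: the data are simulated, the values of b, the bandwidth, and the grid size are assumptions, and no boundary correction (the role of aug or bckden) is applied.

```r
# Kernel density estimate of significant z-statistics on a fixed grid
set.seed(1)
z <- abs(rnorm(500, mean = 2.5))          # simulated z-statistics (illustration only)

a <- qnorm(0.05 / 2, lower.tail = FALSE)  # significance cut-off, start of the interval
b <- 6                                    # hypothetical end of the fitting interval

z_sig <- z[z > a & z < b]                 # keep statistics inside the interval

dens <- density(z_sig, bw = 0.10, from = a, to = b, n = 512)

range(dens$x)   # grid spans the fitting interval
head(dens$y)    # estimated density at the first grid points
```

Because an uncorrected kernel estimate loses mass near the cut-off a, estimators that handle the truncation explicitly are usually preferable for real analyses.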
Control settings for the original z-curve density algorithm
Description
All settings are passed to the density fitting
algorithm. All unspecified settings are set to their default values.
Setting model = "KD1" sets all settings to their default
values irrespective of any other setting and fits z-curve as described
in Brunner and Schimmack (2020).
Arguments
version |
Set to |
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
K |
Number of mixture components, defaults to |
max_iter |
A maximum number of iterations for the nlminb
optimization for fitting mixture model, defaults to |
max_eval |
A maximum number of evaluations for the nlminb
optimization for fitting mixture model, defaults to |
criterion |
A criterion to terminate nlminb optimization,
defaults to |
bw |
A bandwidth of the kernel density estimation, defaults to |
References
Brunner J, Schimmack U (2020). “Estimating population mean power under conditions of heterogeneity and selection for significance.” Meta-Psychology, 4. doi:10.15626/MP.2018.874.
See Also
zcurve(), control_density, control_EM
Examples
# to increase the number of iterations
ctrl <- list(
version = 1,
max_iter = 300
)
## Not run: zcurve(OSC.z, method = "density", control = ctrl)
Prints first few rows of a z-curve data object
Description
Prints first few rows of a z-curve data object
Usage
## S3 method for class 'zcurve_data'
head(x, ...)
Arguments
x |
z-curve data object |
... |
Additional arguments |
See Also
Reports whether x is a zcurve object
Description
Reports whether x is a zcurve object
Usage
is.zcurve(x)
Arguments
x |
an object to test |
Plot fitted z-curve object
Description
Plot fitted z-curve object
Usage
## S3 method for class 'zcurve'
plot(
x,
annotation = FALSE,
CI = FALSE,
extrapolate = FALSE,
plot_type = "base",
y.anno = c(0.95, 0.88, 0.78, 0.71, 0.61, 0.53, 0.43, 0.35),
x.anno = 0.6,
cex.anno = 1,
...
)
Arguments
x |
Fitted z-curve object |
annotation |
Add annotation to the plot. Defaults
to |
CI |
Plot confidence intervals for the estimated z-curve. Defaults
to |
extrapolate |
Scale the chart to the extrapolated area. Defaults
to |
plot_type |
Type of plot to be produced. Defaults to |
y.anno |
A vector of length 8 specifying the y-positions
of the individual annotation lines relative to the figure's height.
Defaults to |
x.anno |
A number specifying the x-position of the block of annotations relative to the figure's width. |
cex.anno |
A number specifying the size of the annotation text. |
... |
Additional arguments including |
See Also
Examples
## Not run:
# simulate some z-statistics and fit a z-curve
z <- abs(rnorm(300,3))
m.EM <- zcurve(z, method = "EM", bootstrap = 100)
# plot the z-curve
plot(m.EM)
# add annotation text and model fit CI
plot(m.EM, annotation = TRUE, CI = TRUE)
# change the location of the annotation to the left
plot(m.EM, annotation = TRUE, CI = TRUE, x.anno = 0)
## End(Not run)
Compute z-score corresponding to a power
Description
A function for computing z-scores of two-sided tests
corresponding to a given power for a given significance level
alpha (or corresponding cut-off z-statistic a).
Usage
power_to_z(
power,
alpha = 0.05,
a = stats::qnorm(alpha/2, lower.tail = FALSE),
two.sided = TRUE,
nleqslv_control = list(xtol = 1e-15, maxit = 300, stepmax = 0.5)
)
Arguments
power |
A vector of powers |
alpha |
Level of significance alpha |
a |
Or, alternatively a z-score corresponding to |
two.sided |
Whether directionality of the effect size should be taken into account. |
nleqslv_control |
A named list of control parameters passed to the nleqslv function used for solving the inverse of z_to_power function. |
Examples
# z-scores corresponding to the (approximate) power of components of EM2
power_to_z(c(0.05, 0.20, 0.40, 0.60, 0.80, 0.974, 0.999), alpha = .05)
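The inversion that power_to_z performs can be illustrated without the package. The sketch below defines the textbook power of a two-sided z-test and inverts it with stats::uniroot(); the package is documented as using nleqslv instead, so this is a stand-in technique, and the helper names power_two_sided and z_for_power are hypothetical.

```r
# Power of a two-sided z-test when the test statistic has mean z
# and unit standard deviation
power_two_sided <- function(z, alpha = 0.05) {
  a <- qnorm(alpha / 2, lower.tail = FALSE)
  pnorm(a, mean = z, lower.tail = FALSE) + pnorm(-a, mean = z)
}

# Numerical inverse: the non-negative z whose power equals the target.
# Targets must lie strictly between alpha and 1.
z_for_power <- function(power, alpha = 0.05) {
  sapply(power, function(p) {
    uniroot(function(z) power_two_sided(z, alpha) - p,
            interval = c(0, 10), tol = 1e-10)$root
  })
}

z_hat <- z_for_power(c(0.20, 0.50, 0.80))
z_hat
power_two_sided(z_hat)  # recovers the requested powers
```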
Prints estimates from z-curve object
Description
Prints estimates from z-curve object
Usage
## S3 method for class 'zcurve'
print.estimates(x, ...)
Arguments
x |
Estimate of a z-curve object |
... |
Additional arguments |
See Also
Prints summary object for z-curve method
Description
Prints summary object for z-curve method
Usage
## S3 method for class 'zcurve'
print.summary(x, ...)
Arguments
x |
Summary of a z-curve object |
... |
Additional arguments |
See Also
Prints a fitted z-curve object
Description
Prints a fitted z-curve object
Usage
## S3 method for class 'zcurve'
print(x, ...)
Arguments
x |
Fitted z-curve object |
... |
Additional arguments |
See Also
Prints a z-curve data object
Description
Prints a z-curve data object
Usage
## S3 method for class 'zcurve_data'
print(x, ...)
Arguments
x |
z-curve data object |
... |
Additional arguments |
See Also
Summarize fitted z-curve object
Description
Summarize fitted z-curve object
Usage
## S3 method for class 'zcurve'
summary(
object,
type = "results",
all = FALSE,
ERR.adj = 0.03,
EDR.adj = 0.05,
round.coef = 3,
conf.level = 0.95,
...
)
Arguments
object |
A fitted z-curve object. |
type |
Whether the results |
all |
Whether additional results, such as file drawer
ratio, expected and missing number of studies, and Soric FDR
should be returned. Defaults to |
ERR.adj |
Confidence intervals adjustment for ERR. Defaults
to |
EDR.adj |
Confidence intervals adjustment for EDR. Defaults
to |
round.coef |
To how many decimals should the coefficients
be rounded. Defaults to |
conf.level |
Confidence level for the confidence intervals. Note
that the |
... |
Additional arguments |
Value
Summary of a z-curve object
See Also
Compute power corresponding to z-scores
Description
A function for computing power of two-sided tests
corresponding to z-scores for a given significance level
alpha (or corresponding cut-off z-score a).
Usage
z_to_power(
z,
alpha = 0.05,
a = stats::qnorm(alpha/2, lower.tail = FALSE),
two.sided = TRUE
)
Arguments
z |
A vector of z-scores |
alpha |
Level of significance alpha |
a |
Or, alternatively a z-score corresponding to |
two.sided |
Whether directionality of the effect size should be taken into account. |
Examples
# mean powers corresponding to the mean components of KD2
z_to_power(0:6, alpha = .05)
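For readers who want to see where these powers come from, the sketch below computes the power of a two-sided z-test from its two rejection tails using base R only. The helper name power_from_z is hypothetical, and the formula is the textbook one rather than a copy of the package source.

```r
# Two-sided power as the sum of the two rejection regions
power_from_z <- function(z, alpha = 0.05) {
  a <- qnorm(alpha / 2, lower.tail = FALSE)        # cut-off z-score
  upper <- pnorm(a, mean = z, lower.tail = FALSE)  # rejection in the direction of the effect
  lower <- pnorm(-a, mean = z)                     # rejection in the opposite direction
  upper + lower
}

# power for true mean z-scores 0 through 6
round(power_from_z(0:6), 3)
```

At z = 0 both tails contribute alpha / 2, so the power equals the significance level.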
Fit a z-curve
Description
zcurve is used to fit z-curve models. The function
takes z-statistics or two-sided p-values as input and returns an object of
class "zcurve" that can be further interrogated with the summary and plot
functions. It defaults to the EM model, but different versions of z-curve can
be specified using the method and control arguments. See
'Examples' and 'Details' for more information.
Usage
zcurve(
z,
z.lb,
z.ub,
p,
p.lb,
p.ub,
data,
method = "EM",
bootstrap = 1000,
parallel = FALSE,
control = NULL
)
Arguments
z |
a vector of z-scores. |
z.lb |
a vector with start of censoring intervals of censored z-scores. |
z.ub |
a vector with end of censoring intervals of censored z-scores. |
p |
a vector of two-sided p-values, internally transformed to z-scores. |
p.lb |
a vector with start of censoring intervals of censored two-sided p-values. |
p.ub |
a vector with end of censoring intervals of censored two-sided p-values. |
data |
an object created with |
method |
the method to be used for fitting. Possible options are
Expectation Maximization |
bootstrap |
the number of bootstraps for estimating CI. To skip
bootstrap specify |
parallel |
whether the bootstrap should be performed in parallel.
Defaults to |
control |
additional options for the fitting algorithm; see control_EM or control_density for more details. |
Details
The function fits the EM version of z-curve by default; changing to
method = "density" gives the KD2 version of z-curve as outlined in
Bartoš and Schimmack (2022). For the original z-curve
(Brunner and Schimmack 2020), referred to as KD1, specify
'method = "density", control = list(model = "KD1")'. Specifying
the lower and upper bounds of z-scores or p-values will fit the censored
version of z-curve described in Schimmack and Bartoš (2023).
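Two-sided p-values are described above as being transformed to z-scores internally. The base R sketch below shows the standard form of that conversion and how a censoring interval on the p-value scale maps to a reversed interval on the z-score scale. The numbers are made up for illustration, and the package's internal handling is assumed rather than reproduced.

```r
# Two-sided p-values to absolute z-scores
p <- c(0.049, 0.01, 0.001)
z <- qnorm(p / 2, lower.tail = FALSE)
round(z, 3)

# A censored p-value known only to lie between p_lb and p_ub:
# the larger p-value bound gives the smaller z-score bound
p_lb <- 0.01
p_ub <- 0.05
z_lb <- qnorm(p_ub / 2, lower.tail = FALSE)
z_ub <- qnorm(p_lb / 2, lower.tail = FALSE)
c(z_lb = z_lb, z_ub = z_ub)
```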
Value
The fitted z-curve object
References
Bartoš F, Schimmack U (2022). “Z-curve 2.0: Estimating replication rates and discovery rates.” Meta-Psychology, 6. doi:10.15626/MP.2021.2720.
Brunner J, Schimmack U (2020). “Estimating population mean power under conditions of heterogeneity and selection for significance.” Meta-Psychology, 4. doi:10.15626/MP.2018.874.
Schimmack U, Bartoš F (2023). “Estimating the false discovery risk of (randomized) clinical trials in medical journals based on published p-values.” PLoS ONE, 18(8), e0290084. doi:10.1371/journal.pone.0290084.
See Also
summary.zcurve(), plot.zcurve(), control_EM, control_density
Examples
# load data from OSC 2015 reproducibility project
OSC.z
# fit an EM z-curve (with disabled bootstrap due to examples times limits)
m.EM <- zcurve(OSC.z, method = "EM", bootstrap = FALSE)
# a version with 1000 bootstrapped samples would look like:
m.EM <- zcurve(OSC.z, method = "EM", bootstrap = 1000)
# or KD2 z-curve (use larger bootstrap for real inference)
m.D <- zcurve(OSC.z, method = "density", bootstrap = FALSE)
# inspect the results
summary(m.EM)
summary(m.D)
# see '?summary.zcurve' for more output options
# plot the results
plot(m.EM)
plot(m.D)
# see '?plot.zcurve' for more plotting options
# to specify more options, set the control arguments
# e.g., increase the maximum number of iterations and change the alpha level
ctr1 <- list(
"max_iter" = 9999,
"sig_level" = .10
)
## Not run: m1.EM <- zcurve(OSC.z, method = "EM", bootstrap = FALSE, control = ctr1)
# see '?control_EM' and '?control_density' for more information about different
# z-curves specifications
z-curve estimates
Description
The following functions extract estimates from the z-curve object.
Usage
ERR(object, round.coef = 3)
EDR(object, round.coef = 3)
ODR(object, round.coef = 3)
Soric(object, round.coef = 3)
file_drawer_ration(object, round.coef = 3)
expected_n(object, round.coef = 0)
missing_n(object, round.coef = 0)
significant_n(object)
included_n(object)
Arguments
object |
the z-curve object |
round.coef |
rounding for the printed values |
Details
Technically, ODR, significant n, and included n are not z-curve estimates but they are grouped in this category for convenience.
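Several of these quantities are simple transformations of the expected discovery rate. The sketch below assumes the definitions given in Bartoš and Schimmack (2022): the file drawer ratio is (1 - EDR) / EDR and Soric's maximum false discovery rate is (1 / EDR - 1) * alpha / (1 - alpha). The EDR value and the study counts are hypothetical inputs, not output from a fitted model.

```r
EDR   <- 0.40   # hypothetical expected discovery rate
alpha <- 0.05   # significance level

# non-significant results expected per significant result
file_drawer_ratio <- (1 - EDR) / EDR

# Soric's upper bound on the false discovery rate
soric_fdr <- (1 / EDR - 1) * (alpha / (1 - alpha))

# observed discovery rate: share of significant results among included ones
n_significant <- 60   # hypothetical counts
n_included    <- 90
ODR <- n_significant / n_included

round(c(file_drawer_ratio = file_drawer_ratio,
        soric_fdr = soric_fdr,
        ODR = ODR), 3)
```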
See Also
Fit a z-curve to clustered data
Description
zcurve_clustered is used to fit z-curve models to
clustered data. The function requires a data object created with the
zcurve_data() function as the input (where id denotes clusters).
Two different methods that account for clustering are implemented via
the EM model: "w" for down weighting the likelihood of the test
statistics proportionately to the number of repetitions in the clusters,
and "b" for a nested bootstrap where only a single study from each
bootstrap is selected for model fitting.
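The two ideas can be sketched in base R on toy data. The example below is an interpretation, not the package's code: it reads "w" as weighting each statistic by one over its cluster size and "b" as drawing a single statistic per cluster within a resample. The ids and z-scores are invented.

```r
id <- c("s1", "s1", "s1", "s2", "s3", "s3")  # hypothetical cluster identifiers
z  <- c(2.1, 2.5, 3.0, 2.2, 4.1, 2.8)        # hypothetical z-statistics

# "w"-style idea: weight each statistic by 1 / (size of its cluster)
w <- 1 / ave(seq_along(id), id, FUN = length)
tapply(w, id, sum)   # every cluster contributes a total weight of 1

# "b"-style idea: keep one randomly chosen statistic per cluster
set.seed(1)
pick <- vapply(split(seq_along(id), id),
               function(i) i[sample.int(length(i), 1)],
               integer(1))
z[pick]              # one z-statistic for each of the three clusters
```

Repeating the second step inside each bootstrap iteration is what makes the resampling nested.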
Usage
zcurve_clustered(
data,
method = "b",
bootstrap = 1000,
parallel = FALSE,
control = NULL
)
Arguments
data |
an object created with |
method |
the method to be used for fitting. Possible options are
down weighting |
bootstrap |
the number of bootstraps for estimating CI. To skip
bootstrap specify |
parallel |
whether the bootstrap should be performed in parallel.
Defaults to |
control |
additional options for the fitting algorithm; see control_EM for more details. |
Value
The fitted z-curve object
See Also
zcurve(), summary.zcurve(), plot.zcurve(), control_EM, control_density
Prepare data for z-curve
Description
zcurve_data is used to prepare data for the
zcurve() function. The function transforms strings containing
reported test statistics "z", "t", "f", "chi", "p" into two-sided
p-values. Test statistics reported as inequalities are considered
to be censored, as are test statistics reported with low accuracy
(i.e., rounded to too few decimals). See Details for more information.
Usage
zcurve_data(data, id = NULL, rounded = TRUE, stat_precise = 2, p_precise = 3)
Arguments
data |
a vector of strings containing the test statistics. |
id |
a vector identifying observations from the same cluster. |
rounded |
an optional argument specifying whether de-rounding should be applied.
Defaults to |
stat_precise |
an integer specifying the numerical precision of
|
p_precise |
an integer specifying the numerical precision of p-values treated as exact values. |
Details
By default, the function extracts the type of test statistic:
"F(df1, df2)=x" for F-statistic with df1 and df2 degrees of freedom,
"chi(df)=x" for Chi-square statistic with df degrees of freedom,
"t(df)=x" for t-statistic with df degrees of freedom,
"z=x" for z-statistic,
"p=x" for p-value.
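The conversions implied by this list are the usual tail-probability calculations. The sketch below shows them with base R distribution functions for one example of each statistic type; the input values are illustrative and the string parsing done by zcurve_data is not reproduced.

```r
# Two-sided p-values from reported test statistics
p_z   <- 2 * pnorm(abs(2.1), lower.tail = FALSE)          # "z = 2.1"
p_t   <- 2 * pt(abs(2.21), df = 34, lower.tail = FALSE)   # "t(34) = 2.21"

# F and chi-square statistics are non-directional, so the upper tail
# already corresponds to a two-sided test
p_F   <- pf(10, df1 = 2, df2 = 23, lower.tail = FALSE)    # "F(2,23) = 10"
p_chi <- pchisq(5.4, df = 1, lower.tail = FALSE)          # "chi(1) = 5.4"

signif(c(z = p_z, t = p_t, F = p_F, chi = p_chi), 3)
```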
The input is not case sensitive and empty spaces are removed automatically. Furthermore,
inequalities ("<" and ">") can be used to denote censoring, i.e., that
the p-value is lower than "x" or that the test statistic is larger than "x",
respectively. The automatic de-rounding procedure (if rounded = TRUE) treats
p-values with fewer decimal places than specified in p_precise or test statistics
with fewer decimal places than specified in stat_precise as censored on an interval
that could result in a given rounded value. For example, a "p = 0.03" input would be
de-rounded as a p-value lower than 0.035 but larger than 0.025.
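The de-rounding rule can be sketched in a few lines of base R: count the reported decimals and, when there are fewer than the precision threshold, replace the value with the interval of numbers that round to it. The helper names n_decimals and deround are hypothetical and the sketch ignores edge cases the package may handle differently.

```r
# number of decimal places in a value reported as a string
n_decimals <- function(x) nchar(sub("^[^.]*\\.?", "", x))

# interval of values that round to x at the given number of decimals
deround <- function(x, digits) {
  half <- 0.5 * 10^(-digits)
  c(lower = x - half, upper = x + half)
}

reported  <- "0.03"
p_precise <- 3                          # threshold for treating p-values as exact

d <- n_decimals(reported)
d < p_precise                           # TRUE: too few decimals, treat as censored
deround(as.numeric(reported), digits = d)
```

For the "p = 0.03" case from the text this gives the interval from 0.025 to 0.035.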
Value
An object of type "zcurve_data".
See Also
zcurve(), print.zcurve_data(), head.zcurve_data()
Examples
# Specify a character vector containing the test statistics
data <- c("z = 2.1", "t(34) = 2.21", "p < 0.03", "F(2,23) > 10", "p = 0.003")
# Obtain the z-curve data object
data <- zcurve_data(data)
# inspect the resulting object
data
Options for the zcurve package
Description
A placeholder object and functions for the zcurve package. (adapted from the runjags R package).
Usage
zcurve.options(...)
zcurve.get_option(name)
Arguments
... |
named option(s) to change - for a list of available options, see details below. |
name |
the name of the option to get the current value of - for a list of available options, see details below. |
Value
The current value of all available zcurve options (after applying any changes specified) is returned invisibly as a named list.