Title: Analyze Data from Analysis of Behavior Experiments
Version: 1.0.6
Description: Analyze data from behavioral experiments conducted using 'MED-PC' software developed by Med Associates Inc. Includes functions to fit exponential and hyperbolic models for delay discounting tasks, exponential mixtures for inter-response times, and Gaussian plus ramp models for peak procedure data, among others. For more details, refer to Alcala et al. (2023) <doi:10.31234/osf.io/8aq2j>.
Encoding: UTF-8
RoxygenNote: 7.3.2
Imports: cluster, doParallel, data.table, dplyr, FNN, foreach, ggplot2, boot, grid, gridExtra, infotheo, ks, magrittr, minpack.lm, Polychrome, scales, sfsmisc, MASS, KernSmooth, zoo, usethis, stats, utils
Depends: R (>= 3.5.0)
License: GPL (>= 3)
Author: Emmanuel Alcala [aut, cre], Rodrigo Sosa [aut], Victor Reyes [aut]
Maintainer: Emmanuel Alcala <jealcalat@gmail.com>
Date: 2025-01-23
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
LazyData: true
NeedsCompilation: no
Packaged: 2025-01-29 19:10:08 UTC; jealcalat
Repository: CRAN
Date/Publication: 2025-01-31 11:00:14 UTC
YEAB: Analyze Data from Analysis of Behavior Experiments
Description
Analyze data from behavioral experiments conducted using 'MED-PC' software developed by Med Associates Inc. Includes functions to fit exponential and hyperbolic models for delay discounting tasks, exponential mixtures for inter-response times, and Gaussian plus ramp models for peak procedure data, among others. For more details, refer to Alcala et al. (2023) doi:10.31234/osf.io/8aq2j.
Author(s)
Maintainer: Emmanuel Alcala <jealcalat@gmail.com>
Authors:
Rodrigo Sosa <rsosas@up.edu.mx>
Victor Reyes <zickvic@hotmail.com>
Delay Discounting Data
Description
Delay Discounting Data
Usage
DD_data
Format
A data frame with 6 rows and 2 columns:
- norm_sv: Normalized subjective values (numeric).
- Delay: Delays (in seconds) for rewards (numeric).
Details
A dataset containing normalized subjective values (SV) and delays used in a delay discounting task.
This dataset represents results from a delay discounting experiment. It demonstrates how subjective values decay with increasing delays.
Source
Generated for a delay discounting analysis.
Computes the Kullback-Leibler divergence based on kernel density estimates
Description
Computes the Kullback-Leibler divergence based on kernel density estimates of two samples.
Usage
KL_div(x, y, from_a, to_b)
Arguments
x: numeric, the values from a sample p
y: numeric, the values from a sample q
from_a: numeric, the lower limit of the integration
to_b: numeric, the upper limit of the integration
Details
The Kullback-Leibler divergence is defined as
D_{KL}(P||Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} dx
Value
A numeric value, the Kullback-Leibler divergence.
Examples
set.seed(123)
p <- rnorm(100)
q <- rnorm(100)
KL_div(p, q, -Inf, Inf) # 0.07579204
q <- rnorm(100, 10, 4)
KL_div(p, q, -Inf, Inf) # 7.769912
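As an added sanity check: for P = N(0, 1) and Q = N(1, 1) the closed form gives D_KL = (mu_P - mu_Q)^2 / 2 = 0.5, so the kernel estimate from moderately large samples should land near that value.
# analytic check: true divergence between N(0,1) and N(1,1) is 0.5
set.seed(123)
p <- rnorm(5000)
q <- rnorm(5000, mean = 1)
KL_div(p, q, -Inf, Inf) # should be close to the analytic 0.5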
Normalization (or rescaling) between arbitrary a and b
Description
Normalization (or rescaling) between arbitrary a and b
Usage
ab_range_normalization(x, a, b)
Arguments
x: numeric, the vector of values to rescale
a: numeric, the lower bound of the target range
b: numeric, the upper bound of the target range
Value
A numeric vector rescaled in the range x' \in [a, b]
Examples
x <- 5:100
a <- 0
b <- 1
x_scaled <- ab_range_normalization(x, a, b)
x_scaled
a <- 100
b <- 1000
x_scaled <- ab_range_normalization(x, a, b)
x_scaled
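A quick added check: min-max rescaling maps min(x) to a and max(x) to b, so the extremes of the output should equal the chosen bounds. The closed form below is an assumption about the implementation, not taken from the package.
# the rescaled extremes should equal the target bounds
range(x_scaled) # 100 1000
# presumable closed form: a + (x - min(x)) * (b - a) / (max(x) - min(x))
all.equal(x_scaled, a + (x - min(x)) * (b - a) / (max(x) - min(x)))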
Peak individual trial analysis using moving average
Description
Peak individual trial analysis using moving average
Usage
balci2019(tasa_norm, bins)
Arguments
tasa_norm: numeric, normalized response rate
bins: numeric, the time bins at which the rate was computed
Details
Based on Balci et al. (2010).
Value
A list with two elements:
- params: a numeric vector with start, stop, spread, and argmax (the bin at which the response rate is maximal)
- mov_av: the moving average
Examples
data("r_times")
# binarize r_times to create response rate at 2 sec bins
bins <- get_bins(r_times, 0, 180, 2)
bin_res <- 6
tasa <- f_table(bins, 0, 180, bin_res)
tasa_norm <- tasa$prop / max(tasa$prop)
bins <- tasa$bins
balci_ind <- balci2019(tasa_norm, bins)
plot(bins, tasa_norm, xlab = "6 sec bins")
lines(bins, balci_ind$mov_av, col = "blue", lwd = 2)
abline(v = balci_ind$params[c(1, 2, 4)], lwd = c(1, 1, 2), col = c(1, 1, "red4"))
Biexponential Refractory Model (BERM)
Description
Implements the biexponential refractory model (BERM) using maximum likelihood estimation to fit parameters for inter-response times (IRTs) within and between bouts.
The model is defined as:
p(IRT = \tau | \tau \ge \delta) = p w e^{-w (\tau - \delta)} + (1 - p) b e^{-b (\tau - \delta)}
where w and b are the within- and between-bout response rates, p is the proportion of responses in bouts, and \delta is the refractory period.
The page also documents the model's companion functions:
- berm_log_likelihood calculates the negative log-likelihood for the BERM model.
- map_onto maps an unconstrained d_hat onto the observed minimum inter-response time (d), ensuring that it aligns with model constraints.
- param_conver converts raw parameters into their constrained forms to enforce model constraints on parameters such as w, l0, l1, and d.
- optimize_berm optimizes the log-likelihood function to estimate BERM model parameters based on observed inter-response times.
Usage
berm(irt, delta)
berm_log_likelihood(params, irt)
map_onto(d, d_hat)
param_conver(params, min_irt, parnames = c("w", "l0", "l1", "d"))
optimize_berm(irt)
Arguments
irt: A numeric vector of inter-response times.
delta: A numeric value for the refractory period.
params: A numeric vector of raw, unconstrained parameters.
d: Minimum inter-response time.
d_hat: Transformed parameter to be mapped onto d.
min_irt: Minimum inter-response time for mapping.
parnames: Optional vector of parameter names for labeling.
Details
This function computes the negative log-likelihood based on biexponential functions for the BERM model, adjusting parameters with param_conver to meet constraints.
Value
A data frame with estimated parameters w
(within-bout rate), b
(between-bout rate), p
(proportion of responses in bouts), and \delta
(adjusted refractory period).
Negative log-likelihood value used for parameter estimation.
Adjusted refractory period used in likelihood estimation.
A named numeric vector of transformed parameters with constraints applied.
A named vector of optimized parameters for the BERM model.
Examples
set.seed(43)
l1 <- 1 / 0.5
l2 <- 1 / 0.1
p <- 0.4
n <- 200
delta <- 0.03
irt <- c(rexp(round(n * p), l1), rexp(round(n * (1 - p)), l2)) + delta
optimize_berm(irt)
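The Usage section also exposes berm(irt, delta), which takes the refractory period directly; a sketch of that call with delta fixed at its simulated value (assumed from the signature above, since the original example only exercises optimize_berm):
# fit BERM with the refractory period fixed at the simulated delta
berm(irt, delta = 0.03)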
Biexponential Model
Description
Implements a simpler biexponential model without the refractory period parameter \delta. The simpler model is defined as:
p(IRT = \tau) = p w e^{-w \tau} + (1 - p) b e^{-b \tau}
where w and b represent the within- and between-bout response rates, and p is the proportion of responses in bouts.
Usage
biexponential(irt)
Arguments
irt: A numeric vector representing inter-response times.
Value
A data frame with estimated parameters w (proportion of responses in bouts), l0 (within-bout mean IRT), and l1 (between-bout mean IRT).
Examples
set.seed(43)
l1 <- 1 / 0.5
l2 <- 1 / 0.1
p <- 0.4
n <- 200
irt <- c(rexp(round(n * p), l1), rexp(round(n * (1 - p)), l2))
biexponential(irt)
Find the best fit for individual trials using optim
Description
Find the best fit for individual trials by minimizing the negative sum of areas between the response rate and the target rate.
Usage
bp_opt(r_times, trial_duration, optim_method = "Brent")
Arguments
r_times: Vector of response times.
trial_duration: Duration of the trial.
optim_method: character, the optimization method to use.
Value
A data frame with the following columns:
- bp: the breakpoint
- r1: the response rate before the breakpoint
- r2: the response rate after the breakpoint
- d1: the duration of the first state
- d2: the duration of the second state
Examples
data("r_times")
r_times <- r_times[r_times < 60]
bp_from_opt <- bp_opt(r_times, 60)
plot(r_times, seq_along(r_times),
xlim = c(0, max(r_times)),
main = "Cummulative Record",
xlab = "Time (s)",
ylab = "Cum Resp",
col = 2, type = "s"
)
abline(v = bp_from_opt$bp)
Find the nearest multiple
Description
Find the nearest multiple
Usage
ceiling_multiple(x, multiple)
Arguments
x: numeric, the value for which we want to find a multiple
multiple: numeric, the multiple
Value
The nearest multiple of multiple that is greater than or equal to x.
Examples
ceiling_multiple(8, 10) # returns 10
ceiling_multiple(12, 10) # returns 20
ceiling_multiple(21, 11) # returns 22
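A one-line sketch of the presumable computation (inferred from the examples above, not necessarily the package's exact implementation):
# round x up to the nearest multiple of `multiple` (hypothetical helper)
ceiling_multiple_sketch <- function(x, multiple) ceiling(x / multiple) * multiple
ceiling_multiple_sketch(c(8, 12, 21), c(10, 10, 11)) # 10 20 22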
Curvature index using Fry derivation
Description
Curvature index using Fry derivation
Usage
curv_index_fry(cr, time_in, fi_val, n = 4)
Arguments
cr: A numeric vector of cumulative responses.
time_in: numeric, time (the x axis in a cumulative response plot).
fi_val: the FI value.
n: numeric, the number of subintervals.
Value
The curvature index as proposed by Fry.
Examples
data("r_times")
r_times <- r_times[r_times < 60]
cr <- seq_along(r_times)
plot(r_times, cr, type = "s", xlim = c(min(r_times), max(r_times)))
segments(
x0 = min(r_times), y0 = min(cr),
x1 = max(r_times), y1 = max(cr)
)
segments(
x0 = min(r_times) + (max(r_times) - min(r_times)) / 2, y0 = min(cr),
x1 = max(r_times), y1 = max(cr),
col = "red"
)
curv_index_fry(cr, r_times, 60, 4)
Curvature index by numerical integration
Description
Curvature index by numerical integration
Usage
curv_index_int(cr, time_in)
Arguments
cr: numeric, cumulative responses.
time_in: numeric, time (the x axis in a cumulative response plot).
Value
A numeric value: the area of the right triangle under the line of constant responding minus the area under the cumulative response curve, expressed as a proportion of the triangle's area.
Examples
data("r_times")
r_times <- r_times[r_times < 60]
cr <- seq_along(r_times)
plot(r_times, cr, type = "s")
curv_index_int(cr, r_times)
segments(
x0 = min(r_times), y0 = min(cr),
x1 = max(r_times), y1 = max(cr)
)
segments(
x0 = min(r_times) + (max(r_times) - min(r_times)) / 2, y0 = min(cr),
x1 = max(r_times), y1 = max(cr),
col = "red"
)
An example dataset of delays and normalized subjective values
Description
A dataset containing delay durations and their respective normalized subjective values (norm_sv).
Usage
dd_example
Format
A data frame with 6 rows and 2 variables:
- delay: The delay duration (in seconds).
- norm_sv: The normalized subjective value corresponding to the delay.
Examples
data(dd_example)
head(dd_example)
Shannon entropy in two dimensions
Description
Shannon entropy in two dimensions
Usage
entropy_kde2d(x, y, n_grid = 150)
Arguments
x: numeric, random vector
y: numeric, random vector
n_grid: numeric, number of grid cells at which to evaluate the density
Value
A numeric value of the entropy in 2D
Examples
set.seed(123)
# Generate a 2D normal distribution with a correlation of 0.6
n <- 1000
mean <- c(0, 0)
sd_x <- 1
sd_y <- 5
correlation <- 0.6
sigma <- matrix(
c(
sd_x^2,
correlation * sd_x * sd_y,
correlation * sd_x * sd_y,
sd_y^2
),
ncol = 2
)
library(MASS)
simulated_data <- mvrnorm(n, mu = mean, Sigma = sigma)
x <- simulated_data[, 1]
y <- simulated_data[, 2]
cov_matr <- cov(cbind(x, y))
sigmas <- diag(cov_matr)
det_sig <- prod(sigmas)
# According to https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Differential_entropy:
normal_entropy <- function(k, pi, det_sig) {
# The left part is a constant;
(k / 2) * (1 + log(2 * pi)) + (1 / 2) * log(det_sig)
}
entropia <- normal_entropy(k = 2, pi = pi, det_sig)
print(entropia) # Should return a value close to 4.3997
result <- entropy_kde2d(x, y, n_grid = 50)
print(result) # Should return a value close to 4.2177
Hyperbolic function
Description
A hyperbolic function to simulate delay discounting data.
Usage
eq_hyp(k, delay)
Arguments
k: numeric constant, the delay discounting parameter
delay: vector of delays
Value
A numeric vector of subjective values between 0 and 1
Examples
delay <- seq(0, 10, len = 100)
k <- 0.2
sv <- eq_hyp(k, delay)
plot(delay, sv,
xlab = "delay",
ylab = "Sv",
type = "l"
)
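The shape above is consistent with Mazur's hyperbola; the following check makes that assumption explicit (an inference from the plot and the hyperbolic_fit page, not something this page states):
# assumed functional form (hypothetical check): SV = 1 / (1 + k * delay)
all.equal(sv, 1 / (1 + k * delay))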
Event extractor
Description
A function to slice data based on start and stop events. This function should be used after read_med(), which outputs a csv of 2 columns: time and events (in that order). Its use is exemplified below.
Usage
event_extractor(data_df, ev0, ev1, evname)
Arguments
data_df: data frame with events ev0 and ev1 (e.g., start of trial and reinforcement delivery)
ev0: event ID start (where the event we want to extract begins)
ev1: event ID stop. This event won't be returned, so keep that in mind.
evname: a string for the event name, for identification purposes. For example, if the event we want to extract is component 1 in a multiple-2 schedule, this can be evname = "c1", so when we extract the second component we can row-combine both into a single data frame.
Details
Works on a trial-by-trial basis.
Value
A data frame of 4 columns: time, events, cum_id, and evname.
Examples
# If we have a component starting with 5 and ending with 3,
# say a Fixed Interval 15s and a dataframe of events from the read_med() function,
# we can extract the data of component "FI15" following the next steps:
# 0 - From the output of the read_med() function, load the extracted data and assign it to df
# 1 - source the event_extractor.R function
# 2 - use it with the appropriate arguments as follows
# read raw data from MED
data("fi60_raw_from_med")
# see first 10 lines
head(fi60_raw_from_med, 10)
# create a temporary file to avoid non-staged installation warning
temp_file <- tempfile(fileext = ".txt")
# write the data to the temporary file
writeLines(fi60_raw_from_med, temp_file)
# Process the file using read_med
example_processed <- read_med(
fname = temp_file, save_file = FALSE,
col_r = "C:", out = TRUE,
col_names = c("time", "event"), num_col = 6, time_dot_event = TRUE
)
# Extract specific events (FI15 in this case)
extracted_FI15 <- event_extractor(
data_df = example_processed,
ev0 = 5, ev1 = 3,
evname = "FI15"
)
# Display the first rows of the extracted data
head(extracted_FI15, 30)
Individual trial analysis for peak procedure data
Description
Individual trial analysis for peak procedure data
Usage
exhaustive_lhl(r_times, trial_duration)
Arguments
r_times: numeric, the times at which a response was emitted in a trial
trial_duration: numeric, the peak trial duration
Value
A data.frame of start, stop, spread, middle time (mid), and the response rate in each state (r1 for the first low-rate state, r2 for the high-rate state, and r3 for the second low-rate state).
Examples
data("r_times")
trial_duration <- max(r_times) |> ceiling() # 180
bps <- exhaustive_lhl(r_times, trial_duration)
plot(
density(
r_times,
adjust = 0.8,
from = 0,
to = 180
),
main = "",
ylab = expression(italic(p(t[R]))),
xlab = "time in peak trial"
)
abline(v = 60, lty = 2)
bps <- exhaustive_lhl(r_times, 180)
abline(v = c(bps$start, bps$stop), col = 2, lty = 2, lwd = 2)
# compare it with fwhm
den <- density(r_times, from = 0, to = trial_duration)
fval <- fwhm(den$x, den$y)
x1 <- fval$peak - fval$fwhm / 2
x2 <- fval$peak + fval$fwhm / 2
plot(den)
abline(v = c(x1, fval$peak, x2), col = c("blue", 1, "blue"))
Single breakpoint algorithm, the exhaustive version used in Guilhardi & Church (2004)
Description
Single breakpoint algorithm, the exhaustive version used in Guilhardi & Church (2004)
Usage
exhaustive_sbp(r_times, trial_duration)
Arguments
r_times: numeric, the times at which a response was emitted in a trial
trial_duration: numeric, the duration of the FI interval
Details
This algorithm performs an exhaustive search over every combination (t1, t2), where t1 ranges from the first response through the (length(r_times) - 1)-th response.
Value
A data frame of 5 columns:
- bp: the time at which a breakpoint was detected
- r1: the response rate before the breakpoint
- r2: the response rate after the breakpoint
- d1: the duration of the first state
- d2: the duration of the second state
Examples
data("r_times")
r_times <- r_times[r_times < 60]
single_bp <- exhaustive_sbp(r_times, 60)
plot(r_times, seq_along(r_times),
xlim = c(0, max(r_times)),
main = "Cummulative Record",
xlab = "Time (s)",
ylab = "Cum Resp",
col = 2, type = "s"
)
abline(v = single_bp$bp)
bp_from_opt <- bp_opt(r_times, 60)
abline(v = bp_from_opt$bp, col = 3)
Exponential fit with nls
Description
This function performs an exponential fit using non-linear least squares (nls).
Usage
exp_fit(value, delay, initial_guess, max_iter = 1e+05, scale_offset = 0)
Arguments
value: A numeric vector of the subjective values (indifference points).
delay: A numeric vector of the delays used in the experiment.
initial_guess: A numeric value providing an initial estimate for the parameter k.
max_iter: An integer specifying the maximum number of iterations for the nls fitting algorithm. Default is 1e5.
scale_offset: A numeric value for the scaling offset used in nls fitting control. Default is 0.
Value
An object of class nls
containing the fitted model.
Examples
# See the examples of hyp_fit
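Beyond that pointer, a minimal standalone fit is sketched here; as in the hyperbolic_fit example, it assumes the fitted model is SV = exp(-k * delay):
data("DD_data")
exp_m <- exp_fit(DD_data$norm_sv, delay = DD_data$Delay, initial_guess = 0.1)
summary(exp_m)
k_exp <- coef(exp_m)
# overlay the fitted exponential discounting curve on the data
delay_vec <- seq(0, max(DD_data$Delay), len = 200)
plot(DD_data$Delay, DD_data$norm_sv, xlab = "Delay", ylab = "SV")
lines(delay_vec, exp(-k_exp * delay_vec), col = "steelblue", lwd = 2)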
Frequency table for binned data
Description
Creates a frequency table from a vector of binned data (for example, the output of get_bins()), including zero-frequency bins. If the bins come from response times, this creates a data.frame of response rates.
Usage
f_table(x, min_x, max_x, bin_res)
Arguments
x: numeric, a vector of binned data
min_x: numeric, the minimal value of x
max_x: numeric, the maximal value of x
bin_res: numeric, the bin resolution
Value
A data frame
Examples
data("r_times")
bin_res <- 2
min_x <- 0
max_x <- 180
x <- get_bins(r_times, min_x, max_x, bin_res)
xt <- f_table(x, min_x, max_x, bin_res)
plot(xt$bins, xt$prop)
Raw Fixed Interval Data
Description
Raw Fixed Interval Data
Usage
fi60_raw_from_med
Format
A character vector containing lines of data from the file.
Details
A dataset containing raw data from a fixed interval (FI) experiment.
This dataset is obtained from a raw file generated during an FI experiment. It provides raw, unprocessed behavioral data.
Source
The raw data was read from the file inst/extdata/fi60_raw.txt.
Fleshler & Hoffman (1962) progression
Description
This function calculates interval values that approximately follow an exponential distribution while avoiding extremely large values.
Usage
fleshler_hoffman(N, VI)
Arguments
N: The total number of intervals.
VI: The value of the variable interval.
Details
This function calculates interval values that approximately follow an exponential distribution while avoiding the extremely large values that can produce extinction. It uses the formula derived in the Fleshler & Hoffman article, where the first factor of the equation is -log(1 - p)^(-1), representing the expected value (mean) of the intervals; this is also the inverse of the rate of a Poisson process, 1/lambda. Since we want the expected value to be the VI value, that constant is replaced with VI. The function handles the case n = N, where the value becomes undefined (log(0)), by using L'Hopital's rule to find the limit of the function as n approaches N. The resulting values are then multiplied by the VI and the logarithm of N to obtain the final interval values.
Value
A vector of calculated values for the intervals.
References
Fleshler, M., & Hoffman, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5(4), 529-530.
Examples
# Calculate intervals for N = 15 and VI = 90
N <- 15
iv <- 90
intervals <- round(fleshler_hoffman(N,iv), 3)
# Plot the intervals and the exponential distribution corresponding to the
# same mean (VI)
hist(intervals, freq = FALSE)
curve(dexp(x, rate = 1/iv), add = TRUE, col = 'red')
legend('topright', legend = c('F&H', 'Exponential'), lty = 1, col = c('black', 'red'))
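As a quick added check of the claim that the expected interval equals the programmed VI:
# the sample mean of the generated intervals should sit near the VI value
mean(fleshler_hoffman(N = 15, VI = 90)) # expected to be close to 90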
Full Width at Half Maximum
Description
Full Width at Half Maximum
Usage
fwhm(x, y)
Arguments
x: numeric, a vector of values from a distribution (density)
y: numeric, a vector of probabilities
Details
The function computes the spread of a symmetric function even when it is not normally distributed. It first finds the x at which y is maximal; x1 and x2 can then be recovered as x1 = peak - fwhm/2 and x2 = peak + fwhm/2.
Value
A list with the fwhm and the x at which the maximum occurred.
Examples
set.seed(170)
rx <- rnorm(100)
den <- density(rx)
fval <- fwhm(den$x, den$y)
x1 <- fval$peak - fval$fwhm / 2
x2 <- fval$peak + fval$fwhm / 2
plot(den)
abline(v = c(x1, fval$peak, x2), col = c(1, 2, 1))
Gaussian Example Data
Description
This dataset contains a series of bins and corresponding response averages from an experimental task.
Usage
gauss_example
Format
A data frame with 91 rows and 2 columns:
- Bin: Numeric. The bin number.
- Response_Average: Numeric. The average response in each bin.
Source
Generated as part of a synthetic example for the task.
Gaussian Example 1 Data
Description
This dataset contains a series of bins and corresponding response averages from another experimental task, similar to the first example.
Usage
gauss_example_1
Format
A data frame with 91 rows and 2 columns:
- Bin: Numeric. The bin number.
- Response_Average: Numeric. The average response in each bin.
Source
Generated as part of a synthetic example for the task.
Gaussian Example 2 Data
Description
This dataset contains a series of bins and corresponding response averages, this time with a slightly different distribution and experimental task design.
Usage
gauss_example_2
Format
A data frame with 91 rows and 2 columns:
- Bin: Numeric. The bin number.
- Response_Average: Numeric. The average response in each bin.
Source
Generated as part of a synthetic example for the task.
Gaussian + ramp fit with LM algorithm
Description
Gaussian + ramp fit with LM algorithm
Usage
gaussian_fit(
responses,
time_vec,
par = list(a = 0.1, d = 0.1, t0 = 18, b = 10, c = 1),
max.iter = 500
)
Arguments
responses: numeric, vector of responses or response rate
time_vec: numeric, time bins
par: a list of parameters for the Gaussian + linear model; see Buhusi, C. V., Perera, D., & Meck, W. H. (2005) for an explanation
max.iter: numeric, max number of iterations
Details
See Buhusi, C. V., Perera, D., & Meck, W. H. (2005). Memory for timing visual and auditory signals in albino and pigmented rats.
This algorithm uses the nonlinear least squares routine nls.lm (Levenberg-Marquardt) from the minpack.lm package.
Value
a numeric vector of coefficients, the same as the par argument
Examples
# Function to create synthetic data
g_plus_lin <- function(par, time) {
par$a * exp(-0.5 * ((time - par$t0) / par$b)**2) + par$c * (time - par$t0) + par$d
}
# real params
pars <- list(a = 20, t0 = 20, b = 10, c = 0.2, d = 1)
# time vector for simulation
ti <- seq(0, 60, 0.1)
# time vector for sampling with 2 sec of resolution
ti_data <- seq(0, 60, 2)
# r(t) real
y_curva <- g_plus_lin(par = pars, ti)
# r(t) sampled with noise
y_data <- g_plus_lin(par = pars, ti_data) + rnorm(length(ti_data), 0, sd = 2)
# param estimation
par_est <- gaussian_fit(responses = y_data, time_vec = ti_data, par = pars, max.iter = 10500)
par_est
# fitted curve
y_hat <- g_plus_lin(par_est |> as.list(), ti)
# plot results
plot(ti,
y_curva,
type = "l",
col = "blue",
lwd = 2,
ylim = c(0, max(y_curva, y_data)),
xlab = "Time in trial",
ylab = "R(t)",
)
points(ti_data, y_data, pch = 21, bg = "red", cex = 1.2)
lines(
ti,
y_hat,
col = "green2",
lwd = 2
)
legend(
"topright",
legend = c("real", "real + noise", "ajuste nls.lm"),
lty = c(1, 0, 1),
pch = c(NA, 21),
pt.bg = c(NA, "red"),
col = c("blue", 1, "green2"),
pt.cex = 0.9,
cex = 0.6
)
Gellerman-like series
Description
Gellerman-like series
Usage
gell_like(n)
Arguments
n: numeric, the length of the series of 0s and 1s to generate (see Details)
Details
The algorithm implements a Gellerman-like series based on Herrera, D., & Treviño, M. (2015). http://doi.org/10.1371/journal.pone.0136084 The algorithm samples from a binomial distribution and imposes two restrictions:
- no more than 3 consecutive values of 0s or 1s;
- the numbers of 0 and 1 trials must be equal for a given n.
Value
a numeric vector of randomly distributed 0s and 1s
Examples
set.seed(165)
gell_like(8) # 0 0 1 1 1 0 1 0
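Both restrictions can be verified with base R (an added sketch using rle):
s <- gell_like(8)
table(s) # equal counts of 0s and 1s
max(rle(s)$lengths) <= 3 # no run of identical values longer than 3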
A function to bin a numeric vector with a given resolution
Description
A function to bin a numeric vector with a given resolution
Usage
get_bins(x, x_min, x_max, res)
Arguments
x: numeric, the vector to be binned
x_min: numeric, the min value from which to create the bins (e.g., 0)
x_max: numeric, the maximum value of the vector x to bin
res: numeric, the resolution; if x is time, res can be 1 s
Value
A vector giving, for each element of x, the bin in which it falls.
Examples
x <- 1:20
get_bins(x, 0, 20, 5)
# Returns
# [1] 5 5 5 5 5 10 10 10 10 10 15 15 15 15 15 20 20 20 20 20
set.seed(10)
x <- runif(20, 0, 10)
get_bins(x, 0, 10, 0.5)
# Returns
# [1] 5.5 3.5 4.5 7.0 1.0 2.5 3.0 3.0 6.5 4.5 7.0 6.0 1.5 6.0 4.0 4.5 1.0 3.0 4.0 8.5
Simulated Data for Hyperbolic Discounting
Description
Simulated Data for Hyperbolic Discounting
Usage
hyp_data
Format
A list with 3 elements:
- sv: A numeric vector of normalized subjective values with added noise.
- delay: A numeric vector of delays (in seconds).
- real_k: The real value of the discounting parameter.
Details
A list of simulated data for fitting hyperbolic discounting models.
This dataset was generated to simulate the behavior of a hyperbolic discounting function. It is commonly used in behavioral economics and psychology to study delay discounting behaviors.
Source
Generated using a custom simulation function.
Hypothetical dataset list for testing purposes
Description
This list contains three components: sv, delay, and real_k, representing subjective values, delay durations, and the real discount rate respectively.
Usage
hyp_data_list
Format
A list with three components:
- sv: A numeric vector of subjective values.
- delay: A numeric vector of delay durations (in arbitrary units).
- real_k: A numeric value representing the real discount rate.
Source
Hypothetical data generated for demonstration.
Examples
data(hyp_data_list)
str(hyp_data_list)
Hyperbolic fit with nls
Description
Hyperbolic fit with nls
Usage
hyperbolic_fit(value, delay, initial_guess, max_iter = 1e+05, scale_offset = 0)
Arguments
value: A numeric vector of the subjective values (indifference points)
delay: A numeric vector of the delays used
initial_guess: A numeric value providing an initial start for k
max_iter: Positive integer with the maximum number of iterations
scale_offset: A constant to be added if the residuals are close to 0. This avoids division by 0, which is known to cause convergence problems.
Value
An object of class nls
Examples
# Simulated data with k = 0.5
data("hyp_data")
delay <- hyp_data$delay
sv <- hyp_data$sv
real_k <- hyp_data$real_k
model_hyp <- hyperbolic_fit(sv, delay, initial_guess = 0.01)
summary(model_hyp)
k_est <- coef(model_hyp)
k_est
# plot real and estimated sv
delay_real <- seq(0, max(delay), len = 100)
# first, simulate how the data should look with the real k
real_sv <- eq_hyp(real_k, delay_real)
# simulate estimated fitting line
est_sv <- eq_hyp(k_est, delay_real)
plot(
delay, sv,
pch = 21,
col = 1,
bg = 8,
xlab = "Delay",
ylab = "Subjective value"
)
lines(
delay_real,
est_sv,
col = "red",
lwd = 2
)
# real data
lines(
delay_real,
real_sv,
type = "l",
col = "blue",
lwd = 2
)
legend(
"topright",
legend = c("data", "real", "fit"),
text.col = "white",
pch = c(21, NA, NA),
col = c(1, NA, NA),
pt.bg = c(8, NA, NA),
bty = "n"
)
legend(
"topright",
legend = c("data", "real", "fit"),
pch = c(NA, NA, NA),
lty = c(NA, 1, 1),
col = c(NA, "blue", "red"),
bty = "n"
)
# Now an example with real data
data("DD_data")
# first, fit a linear model
lineal_m <- lm(norm_sv ~ Delay, data = DD_data)
# hyperbolic model
hyp_m <- hyperbolic_fit(DD_data$norm_sv, delay = DD_data$Delay, 0.1)
# exponential model
exp_m <- exp_fit(DD_data$norm_sv, delay = DD_data$Delay, 0.1)
AIC(lineal_m, hyp_m, exp_m)
# compare visually
k_hyp <- coef(hyp_m)
k_exp <- coef(exp_m)
k_lin <- coef(lineal_m)
delay_vec <- seq(0, max(DD_data$Delay), len = 200)
plot(
DD_data$Delay,
DD_data$norm_sv,
ylim = c(0, 1),
pch = 21,
ylab = "SV",
xlab = "Delay",
bg = "orange",
col = "black"
)
lines(
delay_vec,
eq_hyp(k = k_hyp, delay_vec),
col = "green4",
lwd = 2
)
lines(
delay_vec,
exp(-k_exp * delay_vec),
col = "steelblue",
lwd = 2
)
abline(lineal_m, lty = 2, lwd = 2)
legend(
"topright",
legend = c("data", "exp fit", "hyp fit", "linear fit"),
text.col = "white",
pch = c(21, NA, NA, NA),
col = c(1, NA, NA, NA),
pt.bg = c("orange", NA, NA, NA),
bty = "n"
)
legend(
"topright",
legend = c("data", "exp fit", "hyp fit", "linear fit"),
pch = c(NA, NA, NA, NA),
lty = c(NA, 1, 1, 2),
col = c(NA, "steelblue", "green4", 1),
bty = "n"
)
# plot AIC values
aic_val <- AIC(lineal_m, hyp_m, exp_m) |> round(2)
leg <- sprintf(paste(rownames(aic_val), "= %s", sep = " "), aic_val$AIC)
legend(
"bottomleft",
title = "AIC\n(the smaller, the better)",
legend = leg,
bty = "n"
)
Objective function for finding the best fit for individual trials
Description
This function is used by optim
to find the best fit for
individual trials by minimizing the sum of areas between the response rate
and the target rate. Do not call this function directly.
Usage
ind_trials_obj_fun(params, r_times, trial_duration)
Arguments
params: A vector of parameters to be optimized.
r_times: Vector of response times.
trial_duration: Duration of the trial.
Value
a numeric value representing the sum of areas between the response rate and the target rate.
Find the best fit for individual trials using optim
Description
Find the best fit for individual trials by minimizing the sum of areas between the response rate and the target rate.
Usage
ind_trials_opt(r_times, trial_duration, optim_method = "Nelder-Mead")
Arguments
r_times: Vector of response times.
trial_duration: Duration of the trial.
optim_method: Optimization method. See the optim documentation for the available methods.
Value
A data frame with the following columns:
- start: the start time of the peak
- stop: the stop time of the peak
- spread: the spread of the peak (stop - start)
- middle: the middle of the peak (mean of start and stop)
Examples
response_times <- c(
28.1, 40.7, 44.2, 44.4, 44.7, 45, 45.4, 47.9, 48.1, 48.3,
48.6, 48.8, 49.8, 50.2, 50.7, 51.2, 51.4, 51.7, 51.9, 52.7, 53, 53.5, 53.7,
53.9, 54.1, 54.3, 54.9, 55.3, 55.5, 55.7, 55.8, 57.2, 57.4, 57.7, 58.3,
58.5, 58.7, 60.4, 60.6, 60.7, 61.1, 61.6, 61.8, 62.6, 62.8, 63.1, 63.3,
63.5, 63.8, 64.4, 64.8, 64.9, 65.1, 66.1, 66.4, 67, 68.7, 68.9, 69.5, 69.6,
70.1, 70.9, 71, 71.3, 71.6, 71.8, 73.9, 74.1, 74.4, 74.6, 75.2, 76.4,
76.6, 77.4, 77.6, 77.8, 78.2, 79.3, 79.9, 80.5, 80.7, 81.3, 82.2, 82.4,
82.6, 82.9, 83, 83.1, 83.7, 84.4, 84.4, 84.8, 85, 85.6, 86.6, 87, 87.1,
87.3, 87.4, 87.8, 88.1, 88.2, 89.4, 99.1, 99.3, 99.6, 99.8, 100.2,
133.1, 133.1, 133.6, 134.9, 135.2, 135.3, 135.4, 135.7, 136.5, 173.8,
174.1, 174.3, 174.7, 175.9, 176.3, 176.6, 177.4, 177.5, 177.7, 178.1,
178.2, 178.4, 178.5, 178.8, 179.4
)
trial_duration <- max(response_times)
result <- ind_trials_opt(response_times, trial_duration)
plot(
density(
response_times,
adjust = 0.8,
from = 0,
to = trial_duration
),
main = "Density plot of response times",
xlab = "Response time (ms)",
ylab = expression(italic(p(t[R]))),
)
abline(v = 60, lty = 2)
abline(v = result$start, col = "red")
abline(v = result$stop, col = "red")
abline(v = result$middle, col = "red")
Mutual information of continuous variables using discretization
Description
Mutual information of continuous variables using discretization
Usage
mut_info_discrete(x, y, method = "emp")
Arguments
x: A numeric vector.
y: A numeric vector of equal or unequal size to x.
method: The method used to estimate entropy; available methods are "emp", "mm", "shrink", and "sg" (default: "emp"). See Details.
Details
This function is based on the infotheo package. It uses equalfreq discretization by default. x and y need not be of equal size.
Value
A numeric value representing the mutual information between x and y
References
Meyer, P. E. (2008). Information-Theoretic Variable Selection and Network Inference from Microarray Data. PhD thesis of the Universite Libre de Bruxelles.
Examples
set.seed(123)
x <- rnorm(1000)
y <- rnorm(1000)
plot(x, y)
# close to 0 if they are independent
mut_info_discrete(x, y)
y <- 100 * x + rnorm(length(x), 0, 12)
plot(x, y)
# far from 0 if they are not independent
mut_info_discrete(x, y)
# simulate a sine function with noise
set.seed(123)
x <- seq(0, 5, 0.1)
y <- 5 * sin(x * pi)
y_with_noise <- y + rnorm(length(x), 0, 1)
plot(x, y_with_noise)
lines(x, y, col = 2)
# add a regression line
abline(lm(y ~ x))
# compute correlation coefficient; for nonlinear functions is close to 0
cor(x, y_with_noise)
# mutual information can detect nonlinear dependencies
mut_info_discrete(x, y_with_noise)
Mutual Information for Continuous Variables using kNN
Description
Mutual Information for Continuous Variables using kNN
Usage
mut_info_knn(x, y, k = 5, direct = TRUE)
Arguments
x: Numeric vector.
y: Numeric vector.
k: Number of nearest neighbors to use; default is 5.
direct: Logical; if TRUE, the mutual information is computed directly; if FALSE, it is estimated from entropy estimates.
Value
Numeric; an estimate of the mutual information.
Examples
set.seed(123)
x <- rnorm(1000)
y <- rnorm(1000)
# Close to 0 if they are independent
mut_info_knn(x, y, k = 5)
y <- 100 * x + rnorm(length(x), 0, 12)
# Far from 0 if they are not independent
mut_info_knn(x, y, k = 5)
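Mirroring the sine demonstration on the mut_info_discrete page, an added nonlinear-dependence check:
# simulate a sine function with noise; correlation misses the dependence
set.seed(123)
x <- seq(0, 5, 0.1)
y_with_noise <- 5 * sin(x * pi) + rnorm(length(x), 0, 1)
cor(x, y_with_noise) # close to 0 for nonlinear relations
mut_info_knn(x, y_with_noise, k = 5) # clearly above 0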
Find maximum value within intervals
Description
This function searches for the maximum value within a distribution (represented by vector x) that falls within a series of intervals specified by the vector intervals.
Usage
n_between_intervals(x, intervals, time_in)
Arguments
x: A numeric vector representing the distribution from which to find the maximum value within intervals.
intervals: A numeric vector specifying the intervals within which to search for the maximum value.
time_in: A numeric vector of the time points corresponding to the values in x, used to determine whether the values fall within the specified intervals.
Value
A numeric vector containing the maximum value within each interval specified by 'intervals'. If no values fall within an interval, returns 0 for that interval.
Examples
# Create a vector of data with a logarithmically increasing distribution
log_data <- round(exp(seq(log(1), log(100), length.out = 100)))
# Define intervals to cover the range 1-100
intervals <- seq(1, 100, by = 20)
# Create a corresponding time vector
time_in <- seq(1, 100, length.out = 100)
# Find maximum value within intervals
n_between_intervals(log_data, intervals, time_in)
Objective function for the breakpoint optimization algorithm
Description
Objective function for the breakpoint optimization algorithm
for fixed interval trials. This function is used by optim
to find the
optimal breakpoint. Do not call this function directly.
Usage
objective_bp(param, r_times, trial_duration)
Arguments
param: A numeric value of the breakpoint.
r_times: A numeric vector of response times.
trial_duration: A numeric value of the trial duration.
Value
A numeric value representing the sum of areas between the response rate and the target rate.
Optimization Function for the Biexponential Model
Description
optimize_biexponential optimizes the log-likelihood function to estimate biexponential model parameters based on observed inter-response times. biexponential_log_likelihood calculates the negative log-likelihood for the simpler biexponential model, which does not include the refractory period parameter \delta.
Usage
optimize_biexponential(irt)
biexponential_log_likelihood(params, irt)
Arguments
irt: A numeric vector representing inter-response times.
params: A numeric vector of initial parameter estimates for optimization.
Details
This function computes the negative log-likelihood based on biexponential functions for the simpler biexponential model, adjusting parameters using transformations to meet constraints.
Value
A named vector of optimized parameters for the biexponential model.
Negative log-likelihood value used for parameter estimation.
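This page lacks an Examples section; a minimal sketch, reusing the simulated IRTs from the biexponential page:
# simulate a mixture of within- and between-bout IRTs
set.seed(43)
l1 <- 1 / 0.5
l2 <- 1 / 0.1
p <- 0.4
n <- 200
irt <- c(rexp(round(n * p), l1), rexp(round(n * (1 - p)), l2))
optimize_biexponential(irt)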
Reaction Times from Peak Procedure
Description
Reaction Times from Peak Procedure
Usage
r_times
Format
A numeric vector with 132 elements representing reaction times.
Details
A dataset containing reaction times (in seconds) from an experiment using the peak procedure.
These times are derived from a peak procedure experiment, typically used in behavioral experiments to measure timing abilities in subjects.
Source
Generated during a behavioral analysis experiment.
Process MED to csv based on standard data structure event.time
Description
Process MED to csv based on standard data structure event.time
Usage
read_med(
fname,
save_file = FALSE,
path_save = NULL,
col_r = "C:",
col_names = c("time", "event"),
out = TRUE,
num_col = 6,
time_dot_event = TRUE,
...
)
Arguments
fname: chr, the name of a MED file to read; can include the directory
save_file: logical, save csv on disk? TRUE or FALSE (default)
path_save: chr, directory in which to save csv files if save_file is TRUE
col_r: chr, MED array to read (may be an event.time variable; see Details)
col_names: chr, a vector of column names
out: logical, if TRUE returns the n x 2 data.frame
num_col: int, corresponds to DISKCOLUMNS of MED
time_dot_event: logical, if TRUE, assumes the array to process has a time.event format
...: further arguments passed to the underlying file-reading function
Details
By default time_dot_event = TRUE, which means the raw MED data should follow the time.event convention. For example, if a response is coded as 23, time is in 1/100 seconds, and a response occurred at 2 minutes, the event is saved in, say, column C as 6000.23. This is processed into two columns: time = 6000 and event = 23.
However, if time_dot_event = FALSE, the output will be a data.frame with a single column of values (e.g., 6000.23).
Value
if out is true, returns a data.frame; if save_file is TRUE, writes the data.frame in csv format at path_save
Examples
# read raw data from MED
data("fi60_raw_from_med")
# see first 10 lines
head(fi60_raw_from_med, 10)
# create a temporary file to avoid non-staged installation warning
temp_file <- tempfile(fileext = ".txt")
# write the data to the temporary file
writeLines(fi60_raw_from_med, temp_file)
# Use the temporary file for processing
fi60_processed <- read_med(fname = temp_file, save_file = FALSE,
col_r = "C:", out = TRUE,
col_names = c("time", "event"), num_col = 6, time_dot_event = TRUE)
head(fi60_processed)
# __________________________________________________________________________
## To use in bulk
# 1) Generate a list of filenames of raw MED data
# 2) Loop over the list with the function, using each element
# of the list as the fname argument.
# __________________________________________________________________________
# Suppose all raw MED files start with 2020, and you are in the working directory
# If all the raw MED files are in the wd, we can directly get the filenames
# with unspecified path
# filenames <- list.files(pattern = "^2020")
# The above line will look in the wd for all the files starting with "2020"
# and it will save it as a vector of strings in "filenames".
# With that vector, make a for loop like the following:
# __________________________________________________________________________
# If you want to work immediately with the processed data, first create an empty
# dataframe to store the data file per file
# df_working = data.frame()
# __________________________________________________________________________
# for (f in filenames) {
# df_tmp <- read_med(fname = f,
# path_save = "data/processed/", # put here your path to save the csv
# col_r = 'C:', # if the time.event vector is saved in variable C
# out = TRUE ) # If you want to store processed data in df_tmp,
# otherwise write out = FALSE
# now append at rows the new data.frame
# df_working = rbind(df_working, df_tmp)
# }
# That's all.
Sample from a density estimate
Description
Sample from a density estimate
Usage
sample_from_density(x, n)
Arguments
x: A numeric vector from a known or unknown distribution.
n: Number of samples to return.
Value
A sample of size n whose distribution is close to that of x.
Examples
x <- rnorm(1000)
y <- sample_from_density(x, 1000)
hist(x,
breaks = 30,
freq = FALSE,
col = "grey",
main = "Original data"
)
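The example plots only the original data; an added overlay compares the resampled values:
# overlay the density of the resampled values on the histogram of x
lines(density(y), col = "red", lwd = 2)
legend("topright", legend = "resampled", col = "red", lty = 1, bty = "n")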
Area under the curve (AUC)
Description
Calculate the area under the curve (AUC) using the trapezoid method.
Usage
trapezoid_auc(x, y)
Arguments
x: A numeric vector of x values.
y: A numeric vector of y values.
Value
A numeric value of the area under the curve
Examples
x_values <- c(0, 1, 2, 3, 4) # Delay times
y_values <- c(1, 0.8, 0.6, 0.4, 0.2) # Discounted values
auc_result <- trapezoid_auc(x_values, y_values)
print(paste("Area Under Curve: ", auc_result))
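Because these y values decrease linearly, the trapezoid rule is exact here; a quick added check against the closed-form integral of 1 - 0.2 x over [0, 4], which equals 2.4:
all.equal(auc_result, 2.4) # trapezoid result matches the analytic area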
Min-max normalization (also feature rescaling)
Description
Min-max normalization (also feature rescaling)
Usage
unit_normalization(x)
Arguments
x: numeric, vector of values to rescale
Value
A numeric vector rescaled in the range x' \in [0, 1]
Examples
x <- 5:100
x_scaled <- unit_normalization(x)
x_scaled
True value in interval
Description
True value in interval
Usage
val_in_interval(df, lowLim, upLim, true.val)
Arguments
df: A data frame containing the intervals to be evaluated. Each row should correspond to an interval with lower and upper limits.
lowLim: the column index or name in the data frame that holds the lower limits
upLim: the column index or name in the data frame that holds the upper limits
true.val: the true value to be checked for whether it falls within the interval defined by lowLim and upLim
Value
A numeric vector of n elements with an integer value for each interval: 0 if the value is below the interval, 1 if it is inside the interval (with a rightmost open limit), and 2 if it is above the interval.
Examples
# Example data frame with intervals
df <- data.frame(lower = c(1, 5, 10), upper = c(3, 8, 15))
# Check if the value 6 is within any of the intervals
val_in_interval(df, "lower", "upper", 6)
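Per the Value section, 6 is above (1, 3), inside (5, 8), and below (10, 15), so the call above should return 2 1 0. One more case, added as an illustration of the coding:
# a value above every interval returns 2 for each row
val_in_interval(df, "lower", "upper", 16)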