Type: Package
Title: Process CBASS-Derived PAM Data
Version: 0.2.0
Maintainer: Luigi Colin <reefgenomics@gmail.com>
Description: Tools to process CBASS-derived PAM data efficiently. Minimal requirements are PAM-based photosynthetic efficiency data (or data from any other continuous variable that changes with temperature, e.g. relative bleaching scores) from 4 coral samples (nubbins) subjected to 4 temperature profiles of at least 2 colonies from 1 coral species from 1 site. Please refer to the following CBASS (Coral Bleaching Automated Stress System) papers for in-depth information regarding CBASS acute thermal stress assays, experimental design considerations, and ED5/ED50/ED95 thermal parameters: Nicolas R. Evensen et al. (2023) <doi:10.1002/lom3.10555> Christian R. Voolstra et al. (2020) <doi:10.1111/gcb.15148> Christian R. Voolstra et al. (2025) <doi:10.1146/annurev-marine-032223-024511>.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
Imports: drc, rlog, stats, utils, dplyr, ggplot2, readxl, glue
Depends: R (≥ 3.5.0)
Date: 2025-05-28
NeedsCompilation: no
Packaged: 2025-06-04 10:40:29 UTC; colinl
Author: Yulia Iakovleva ORCID iD [aut], Luigi Colin ORCID iD [aut, cre], Christian Robert Voolstra ORCID iD [aut]
Repository: CRAN
Date/Publication: 2025-06-06 12:50:08 UTC

Calculate ED5, ED50, and ED95 values for all samples in the dataset.

Description

Calculate ED5, ED50, and ED95 values for all samples in the dataset.

Usage

calculate_eds(
  cbass_dataset,
  grouping_properties = c("Site", "Condition", "Species", "Timepoint"),
  drm_formula = "Pam_value ~ Temperature"
)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint").

drm_formula

A formula object specifying the dose-response model. Default: "Pam_value ~ Temperature".

Value

A data frame with ED5, ED50, and ED95 values for each grouping property.

Examples

# Example dataset
data(cbass_dataset)

# Extract the ED5, ED50, and ED95 values as a data frame
eds_df <- calculate_eds(cbass_dataset)

CBASS Dataset

Description

A dataset containing example simulated experimental data.

Usage

data(cbass_dataset)

Format

A data frame with 240 observations and 9 variables:

Project

Name to identify the project/experiment.

Latitude

Latitude of the observation collection site in decimal format.

Longitude

Longitude of the observation collection site in decimal format.

Date

Date of the observation in YYYYMMDD format.

Country

Country of the observation in 3-letter ISO country code format.

Site

Site of the observation, e.g., name of the reef.

Condition

Descriptor to set apart samples from the same species and site, e.g. probiotic treatment vs. control; nursery vs. wild; diseased vs. healthy; can be used to designate experimental treatments besides heat stress. If no other treatments, write 'not available'.

Species

Species of the observation. We recommend providing the name of the coral as accurate as possible, e.g. Porites lutea or Porites sp.

Genotype

Denotes samples/fragments/nubbins from distinct colonies in a given dataset; we recommend to use integers, i.e. 1, 2, 3, 4, 5, etc.

Temperature

CBASS treatment temperatures; must be \ge 4 different temperatures; must be integer; e.g. 30, 33, 36, 39. Typical CBASS temperature ranges are average summer mean MMM, MMM+3°C, MMM+6°C, MMM+9°C).

Timepoint

Timepoint of PAM measurements in minutes from start of the thermal cycling profile; typically: 420 (7 hours after start, i.e., after ramping up, heat-hold, ramping down) or 1080 (18 hours after start, i.e. at the end of the CBASS thermal cycling profile); differences in ED50s between timepoints 420 and 1080 may be indicative of resilience/recovery (if 1080 ED50 > 420 ED50) or collapse (if 1080 ED50 < 420 ED50).

Pam_value

Fv/Fm value of a given sample (format: \ge0 and \le1, e.g. 0.387); note that technically any continuous variable can be used for ED50 calculation (e.g., coral whitening; black/white pixel intensity; etc.) and be provided in this column.

Details

This dataset provides experimental data with various attributes for demonstration and testing purposes.

Examples

# Load the sample dataset
data(cbass_dataset)
head(cbass_dataset)


Check if the dataset contains enough unique temperature values

Description

Check if the dataset contains enough unique temperature values

Usage

check_enough_unique_temperatures_values(dataset)

Arguments

dataset

The input dataset containing the 'Temperature' column to be analyzed. The 'Temperature' column should be numeric representing temperature values.

Value

Logical value (TRUE or FALSE) indicating whether there are enough unique temperature values (at least 4) in the dataset.

Examples

data <- data.frame(Temperature = c(25, 30, 25, 35, 28, 28))
check_enough_unique_temperatures_values(data)
# Output: TRUE

Check and Convert Columns in Dataset

Description

This function checks the data types of specific columns in the provided dataset and converts them to the desired data type if necessary. The function is designed to ensure that certain columns are represented as factors or numeric values, as required for further analysis.

Usage

convert_columns(dataset)

Arguments

dataset

A data frame containing the dataset to be checked and modified.

Value

A modified version of the input dataset with the specified columns converted to factors or numeric types, as appropriate.

Examples

# Sample dataset
data <- data.frame(
                Project = c("Project A", "Project A", "Project B"),
                Date = c("2023-07-31", "2023-08-01", "2023-08-02"),
                Site = c("Site X", "Site Y", "Site Z"),
                Country = c("Country A", "Country B", "Country C"),
                Latitude = c(34.05, 36.16, 40.71),
                Longitude = c(-118.24, -115.15, -74.01),
                Species = c("Species 1", "Species 2", "Species 3"),
                Genotype = c("Genotype A", "Genotype B", "Genotype C"),
                Condition = c("Condition 1", "Condition 2", "Condition 3"),
                Timepoint = c("Timepoint 1", "Timepoint 1", "Timepoint 2"),
                Temperature = c(30.2, 31.5, 29.8),
                Pam_value = c(0.5, 0.6, 0.8)
                )

# Convert columns in the dataset
modified_data <- convert_columns(data)

# The 'Site', 'Condition', 'Species', and 'Genotype' columns are now factors,
# and 'Temperature' and 'PAM' columns are now numeric in the modified_data.

Check if the dataset has all mandatory columns

Description

This function checks if a given dataset contains all the mandatory columns specified in the mandatory_columns vector.

Usage

dataset_has_mandatory_columns(dataset)

Arguments

dataset

A data frame representing the dataset to be validated.

Value

A logical value indicating whether all the mandatory columns are present in the dataset.

Examples

# Sample dataset
sample_data <- data.frame(
                Project = c("Project A", "Project A", "Project B"),
                Date = c("2023-07-31", "2023-08-01", "2023-08-02"),
                Site = c("Site X", "Site Y", "Site Z"),
                Country = c("Country A", "Country B", "Country C"),
                Latitude = c(34.05, 36.16, 40.71),
                Longitude = c(-118.24, -115.15, -74.01),
                Species = c("Species 1", "Species 2", "Species 3"),
                Genotype = c("Genotype A", "Genotype B", "Genotype C"),
                Condition = c("Condition 1", "Condition 2", "Condition 3"),
                Timepoint = c("Timepoint 1", "Timepoint 1", "Timepoint 2"),
                Temperature = c(30.2, 31.5, 29.8),
                Pam_value = c(0.5, 0.6, 0.8)
                )

dataset_has_mandatory_columns(sample_data)
# Output: TRUE

# Sample dataset with missing columns
missing_columns_data <- data.frame(Label = c("A", "B", "C"),
                                   Date = c("2023-07-31", "2023-08-01", "2023-08-02"))

dataset_has_mandatory_columns(missing_columns_data)
# Output: FALSE

Define Grouping Property Column

Description

This function creates a new column 'GroupingProperty' in the provided dataset by merging specified columns using a specified separator. The new column is created as a factor variable.

Usage

define_grouping_property(dataset, grouping_properties, sep = "_")

Arguments

dataset

A data frame where the new 'GroupingProperty' column will be added.

grouping_properties

A character vector containing the names of the columns to be merged.

sep

A character string used as a separator when merging columns (default is "_").

Value

A data frame with the added 'GroupingProperty' column.

Examples

# Create a sample data frame
data <- data.frame(Category = c("A", "B", "C"),
                   Subcategory = c("X", "Y", "Z"),
                   Value = c(10, 20, 30))

# Define grouping property using 'Category' and 'Subcategory'
new_data <- define_grouping_property(data, c("Category", "Subcategory"), sep = "-")

# Resulting data frame:
#   Category Subcategory Value GroupingProperty
# 1        A           X    10              A-X
# 2        B           Y    20              B-Y
# 3        C           Z    30              C-Z

Define Temperature Ranges

Description

This function takes a vector of temperatures and generates a sequence of temperature values that span the range of input temperatures.

Usage

define_temperature_ranges(temperatures, n = 100)

Arguments

temperatures

A numeric vector containing temperature values.

n

An integer specifying the length of the output sequence. Default is 100.

Value

A numeric vector containing a sequence of temperature values ranging from the minimum temperature to the maximum temperature plus 1, evenly spaced based on the specified length.

Examples

temperatures <- c(10, 15, 20, 25, 30)
define_temperature_ranges(temperatures)

Plot an exploratory temperature response curve.

Description

Plot an exploratory temperature response curve.

Usage

exploratory_tr_curve(
  cbass_dataset,
  grouping_properties = c("Site", "Condition", "Species", "Timepoint"),
  faceting = "Species ~ Site ~ Condition",
  size_text = 12,
  size_points = 2
)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint").

faceting

A formula specifying the faceting of the plot. Default: "Species ~ Site ~ Condition".

size_text

Default: 12. A formula specifying the faceting of the plot.

size_points

Default: 2. A formula specifying the faceting of the plot.

Value

A ggplot object representing the temperature response curve.

Examples

# Load example dataset
data(cbass_dataset)

# Create an exploratory temperature response curve
exploratory_curve <- exploratory_tr_curve(cbass_dataset)

Fit dose-response models and calculate summary statistics for ED5, ED50, and ED95 values.

Description

Fit dose-response models and calculate summary statistics for ED5, ED50, and ED95 values.

Usage

fit_curve_eds(
  cbass_dataset,
  grouping_properties = c("Site", "Condition", "Species", "Timepoint"),
  drm_formula = "Pam_value ~ Temperature"
)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint").

drm_formula

A formula object specifying the dose-response model. Default: "Pam_value ~ Temperature".

Value

A data frame with summary statistics for ED5, ED50, and ED95 values.

Examples

# Example dataset
data(cbass_dataset)

# Example grouping properties
grouping_properties <- c("Site", "Condition", "Species", "Timepoint")

# Extract the ED5, ED50, and ED95 values as a data frame
fitted_edss_df <- fit_curve_eds(cbass_dataset, grouping_properties)

Fit Dynamic Regression Models (DRMs)

Description

This function fits dynamic regression models (DRMs) to a given dataset using the specified grouping properties and DRM formula.

Usage

fit_drms(
  dataset,
  grouping_properties,
  drm_formula,
  is_curveid = FALSE,
  LL.4 = FALSE
)

Arguments

dataset

A data frame containing the dataset on which to fit the DRMs.

grouping_properties

A character vector specifying the names of columns in the dataset that will be used as grouping properties for fitting separate DRMs.

drm_formula

A formula specifying the dynamic regression model to be fitted. This formula should follow the standard R formula syntax (e.g., y ~ x1 + x2).

is_curveid

A boolean value indicating if you want to use this parameter to fit the model

LL.4

Logical. If TRUE, the LL.4 model is used instead of LL.3. LL.3 is preferred, as PAM data is expected to never be lower than zero. In cases with overly correlated data and steep slopes, LL.4 allows the lower limit to vary, which can help to better fit the model.

Value

A list of fitted DRM models, with each element corresponding to a unique combination of grouping property values.

Examples

data(cbass_dataset)
preprocessed_data <- preprocess_dataset(cbass_dataset)
fit_drms(preprocessed_data,
       c("Site", "Condition", "Species", "Genotype"),
        "Pam_value ~ Temperature")


Get ED5s, ED50s and ED95s by Grouping Property

Description

This function takes a list of models and extracts the ED5s, ED50s and ED95s values for each model based on a specified grouping property. The ED5s, ED50s and ED95s values is extracted from the model's coefficients and is associated with the intercept term.

Usage

get_all_eds_by_grouping_property(models)

Arguments

models

A list of models where each element represents a model object containing coefficients.

Value

A data frame containing the ED50 values along with their corresponding grouping property. Each row represents a model's ED50 value and its associated grouping property.

Examples

data(cbass_dataset)
preprocessed_data <- preprocess_dataset(cbass_dataset)
model_list <- fit_drms(preprocessed_data,
           c("Site", "Condition", "Species", "Genotype"),
            "Pam_value ~ Temperature", is_curveid = TRUE)
eds_data <- get_all_eds_by_grouping_property(model_list)

# Resulting data frame structure:
#   ED5         ED50         ED95         GroupingProperty
# 1 ED5_value_1 ED50_value_1 ED95_value_1 Group1
# 2 ED5_value_2 ED50_value_2 ED95_value_2 Group2

Get ED50 by Grouping Property

Description

This function takes a list of models and extracts the ED50 value for each model based on a specified grouping property. The ED50 value is extracted from the model's coefficients and is associated with the intercept term.

Usage

get_ed50_by_grouping_property(models)

Arguments

models

A list of models where each element represents a model object containing coefficients.

Value

A data frame containing the ED50 values along with their corresponding grouping property. Each row represents a model's ED50 value and its associated grouping property.

Examples

data(cbass_dataset)
preprocessed_data <- preprocess_dataset(cbass_dataset)
model_list <-fit_drms(preprocessed_data,
           c("Site", "Condition", "Species", "Genotype"),
            "Pam_value ~ Temperature")
ed50_data <- get_ed50_by_grouping_property(model_list)

# Resulting data frame structure:
#   ED50     GroupingProperty
# 1 ED50_value_1 Group1
# 2 ED50_value_2 Group2

Get Predicted PAM Values

Description

This function takes a list of models and a temperature range, and generates predicted PAM (Pulse Amplitude Modulation) values based on the provided models and temperature range.

Usage

get_predicted_pam_values(models, temp_range)

Arguments

models

A list of model objects representing PAM prediction models.

temp_range

A numeric vector containing a sequence of temperature values for which PAM predictions will be generated.

Value

A data frame containing the predicted PAM values along with corresponding temperature values from the given temperature range.

See Also

predict_temperature_values, transform_predictions_to_long_dataframe, define_temperature_ranges predict_temperature_values transform_predictions_to_long_dataframe define_temperature_ranges

Examples

# Load models and temperature range
# To load internal dataset that is provided with the R package
data("cbass_dataset")
cbass_dataset <- preprocess_dataset(cbass_dataset)
grouping_properties <- c("Site", "Condition", "Species", "Timepoint")
drm_formula <- "Pam_value ~ Temperature"

# Make list of model
models <- fit_drms(cbass_dataset, grouping_properties, drm_formula, is_curveid = FALSE)
temp_ranges <- define_temperature_ranges(cbass_dataset$Temperature, n = 100)

# Get predicted Pam_value values
predicted_pam <- get_predicted_pam_values(models, temp_ranges)


Get the names of mandatory columns for the dataset.

Description

This function returns a character vector containing the names of columns that are considered mandatory for a dataset to meet certain requirements.

Usage

mandatory_columns()

Value

A character vector containing the names of mandatory columns.

Examples

mandatory_cols <- mandatory_columns()
print(mandatory_cols)

# [1] "Project"     "Date"        "Site"        "Genotype"   
# [5] "Species"     "Country"     "Latitude"    "Longitude"  
# [9] "Condition"   "Temperature" "Timepoint"   "Pam_value"  


Plot a boxplot of ED50 values for different species and conditions.

Description

Plot a boxplot of ED50 values for different species and conditions.

Usage

plot_ED50_box(
  cbass_dataset,
  grouping_properties = c("Site", "Condition", "Species", "Timepoint"),
  drm_formula = "Pam_value ~ Temperature",
  Condition = "Condition",
  faceting = "~ Site",
  size_text = 12,
  size_points = 2
)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint")

drm_formula

A formula object specifying the dose-response model. Default: "Pam_value ~ Temperature".

Condition

A character string specifying the condition to be used for coloring the plot. Default: "Condition".

faceting

A formula specifying the faceting of the plot. Default: "~ Site".

size_text

Default: 12. A formula specifying the faceting of the plot.

size_points

Default: 2. A formula specifying the faceting of the plot.

Value

A ggplot object representing the boxplot of ED50 values.

Examples

# Example dataset
data(cbass_dataset)

# Example grouping properties
grouping_properties <- c("Site", "Condition", "Species", "Timepoint")

# Make ggplot object
boxplot_ED50 <- plot_ED50_box(cbass_dataset)


Plot the model curve with predicted PAM values and confidence intervals.

Description

Plot the model curve with predicted PAM values and confidence intervals.

Usage

plot_model_curve(
  cbass_dataset,
  grouping_properties = c("Site", "Condition", "Species", "Timepoint"),
  drm_formula = "Pam_value ~ Temperature",
  faceting_model = "Species ~ Site ~ Condition",
  size_text = 12,
  size_points = 2
)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint").

drm_formula

A formula object specifying the dose-response model. Default: "Pam_value ~ Temperature".

faceting_model

A formula specifying the faceting of the plot. Default: "Species ~ Site ~ Condition".

size_text

Default: 12. A formula specifying the faceting of the plot.

size_points

Default: 2. A formula specifying the faceting of the plot.

Value

A ggplot object representing the model curve with predicted PAM values.

Examples

data(cbass_dataset)

model_curve_plot <- plot_model_curve(cbass_dataset)

Predict the temperature values

Description

This function takes a list of models and generates a sequence of temperature values that span the range of input temperatures.

Usage

predict_temperature_values(models, temp_range)

Arguments

models

A list of models where each element represents a model object containing coefficients.

temp_range

the temperature range to be used for predictions from the function define_temperature_ranges

Value

A data frame containing the predicted PAM values for each temperature along with their corresponding grouping property. Each row represents a model's predicted PAM value and its associated grouping property and confidence interval.

Examples

data(cbass_dataset)
preprocessed_data <- preprocess_dataset(cbass_dataset)

models <- fit_drms(preprocessed_data,
                 c("Site", "Condition", "Species", "Timepoint"),
                   "Pam_value ~ Temperature", is_curveid = TRUE)
temp_ranges <- define_temperature_ranges(cbass_dataset$Temperature, n = 100)
predict_temperature_values(models, temp_ranges)


Preprocesses the data by converting and checking column data types.

Description

This function preprocesses the input data by performing checks on column data types and converting them if necessary. It ensures that the dataset meets certain requirements before further analysis or modeling.

Usage

preprocess_dataset(dataset)

Arguments

dataset

A data frame containing the dataset to be preprocessed.

Value

A preprocessed data frame with converted and validated column data types.

Examples

# Load a sample dataset
data(cbass_dataset)
# Preprocess the dataset
preprocessed_data <- preprocess_dataset(cbass_dataset)

Experimental Features for CBASS Dataset Analysis

Description

This script contains functions for processing and analyzing CBASS datasets, including fitting dose-response models, calculating effective doses (ED5, ED50, ED95), and plotting results.

Process the dataset by preprocessing, validating, and defining grouping properties.

Usage

process_dataset(cbass_dataset, grouping_properties)

Arguments

cbass_dataset

A data frame containing the dataset to be processed.

grouping_properties

A character vector of column names to be used for grouping. Default: c("Site", "Condition", "Species", "Timepoint").

Value

A processed data frame with the grouping property defined.

Examples

# Example dataset
data(cbass_dataset)

# Example grouping properties
grouping_properties <- c("Site", "Condition", "Species", "Timepoint")

# Process the dataset
processed_dataset <- process_dataset(cbass_dataset, grouping_properties)

Read Data Function

Description

Reads data from a file based on its format (Excel or CSV).

Usage

read_data(file_path)

Arguments

file_path

Character string specifying the path to the input file.

Details

This function determines the file format based on the file extension and uses appropriate methods to read data from either Excel (xls, xlsx) or CSV (csv, txt) files.

Value

A data frame containing the read data.

Examples

# Read data from an Excel file
read_data(system.file("extdata", "example.xlsx", package = "CBASSED50"))

# Read data from a CSV file
read_data(system.file("extdata", "example.csv", package = "CBASSED50"))


Transform Predictions to a Long-Format DataFrame

Description

This function takes a list of predictions and converts them into a long-format data frame. Each prediction corresponds to a different temperature range.

Usage

transform_predictions_to_long_dataframe(predictions, temp_range)

Arguments

predictions

A list of data frames where each data frame represents predictions for a specific temperature range. The data frames should have a common grouping property.

temp_range

the temperature range to be used for predictions from the function define_temperature_ranges

Value

A long-format data frame containing the transformed predictions with columns for "GroupingProperty," "Temperature," and "PredictedPAM."


Validate CBASS Dataset

Description

This function validates a dataset to ensure it contains all the mandatory columns required for further processing. If any mandatory columns are missing, it raises an error with the list of missing columns.

Usage

validate_cbass_dataset(dataset)

Arguments

dataset

A data frame representing the CBASS dataset to be processed and validated.

Value

A processed and validated CBASS dataset with appropriate data types for its columns.

See Also

dataset_has_mandatory_columns, convert_columns, check_enough_unique_temperatures_values

Examples

# Assuming a dataset named 'cbass_dataset' is available in the environment
data(cbass_dataset)
preprocessed_data <- preprocess_dataset(cbass_dataset)
validate_cbass_dataset(preprocessed_data)