Title: | Logic Forest |
Version: | 2.1.2 |
Depends: | R (≥ 2.10) |
Imports: | LogicReg, methods, survival |
Suggests: | data.table |
Description: | Logic Forest is an ensemble machine learning method that identifies important and interpretable combinations of binary predictors using logic regression trees to model complex relationships with an outcome. Wolf, B.J., Slate, E.H., Hill, E.G. (2010) <doi:10.1093/bioinformatics/btq354>. |
License: | GPL-3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.1 |
NeedsCompilation: | no |
Packaged: | 2025-06-16 14:35:33 UTC; nika02 |
Author: | Bethany Wolf [aut], Melica Nikahd [ctb, cre], Andrew Gothard [ctb], Madison Hyer [ctb] |
Maintainer: | Melica Nikahd <melica.nikahd@osumc.edu> |
Repository: | CRAN |
Date/Publication: | 2025-07-14 17:10:09 UTC |
Internal Logic Forest Functions
Description
INTERNAL FUNCTION TO CREATE PERMUTATIONS OF N VARIABLES
This function is called by TTab
. It is not intended to be used independently of this function.
Usage
Perms(n)
Arguments
n |
integer |
Value
mat matrix with each combination of n variables, where inclusion/exclusion is denoted by 0/1
Truth table
Description
Internal function to evaluate importance of predictor combinations
Usage
TTab(data, tree, Xs, mtype)
Arguments
data |
Input data. |
tree |
Logic tree constructed for a sample. |
Xs |
Input predictors. |
mtype |
Model type: classification or linear or survival regression. |
Details
Creates a matrix of all interactions contained in one sample's logic tree.
Value
mat.truth
: Matrix of all interactions contained in one sample's logic tree
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
See Also
Function to find complement of the original tree and find complement PIs ### Internal Logic Forest Functions
Description
Function to find complement of the original tree and find complement PIs
This function is called by pimp.import
. It is not intended to be used independently of this function.
Usage
find.ctree(tree)
Arguments
tree |
An object of class |
Value
ctree An object of class "logregtree"
that is the complement of tree
INTERNAL FUNCTION TO EVALUATE IMPORTANCE OF PREDICTOR COMBINATIONS
Description
INTERNAL FUNCTION TO EVALUATE IMPORTANCE OF PREDICTOR COMBINATIONS
Usage
frame.logreg2(fit, msz, ntr, newbin, newresp, newsep, newcens, newweight)
Arguments
fit |
An object of type |
msz |
Max number of leaves on a tree |
ntr |
Number of trees of fit |
newbin |
new binary variables |
newresp |
new response variable |
newsep |
new number of separate predictors |
newcens |
new censoring indicators |
newweight |
new weightings |
Details
This function is called by predict.logreg2
. It is not intended to be used independently.
Value
outframe
Logic Forest & Logic Survival Forest
Description
Constructs an ensemble of logic regression models using bagging for classification or regression, and identifies important predictors and interactions. Logic Forest (LF) efficiently searches the space of logical combinations of binary variables using simulated annealing. It has been extended to support linear and survival regression.
Usage
logforest(
resp.type,
resp,
resp.time = data.frame(X = rep(1, nrow(resp))),
Xs,
nBSXVars,
anneal.params,
nBS = 100,
h = 0.5,
norm = TRUE,
numout = 5,
nleaves
)
Arguments
resp.type |
String indicating regression type: |
resp |
Numeric vector of response values (binary for classification/survival, continuous for linear regression). For time-to-event, indicates event/censoring status. |
resp.time |
Numeric vector of event/censoring times (used only for survival models). |
Xs |
Matrix or data frame of binary predictor variables (0/1 only). |
nBSXVars |
Integer. Number of predictors sampled for each tree (default is all predictors). |
anneal.params |
A list of parameters for simulated annealing (see |
nBS |
Number of trees to fit in the logic forest. |
h |
Numeric. Minimum proportion of trees predicting "1" required to classify an observation as "1" (used for classification). |
norm |
Logical. If |
numout |
Integer. Number of predictors and interactions to report. |
nleaves |
Integer. Maximum number of leaves (end nodes) allowed per tree. |
Details
Logic Forest is designed to identify interactions between binary predictors without requiring their pre-specification. Using simulated annealing, it searches the space of all possible logical combinations (e.g., AND, OR, NOT) among predictors. Originally developed for binary outcomes in gene-environment interaction studies, it has since been extended to linear and time-to-event outcomes (Logic Survival Forest).
Value
A logforest
object containing:
- Predictor.frequency
Frequency of each predictor across trees.
- Predictor.importance
Importance of each predictor.
- PI.frequency
Frequency of each interaction across trees.
- PI.importance
Importance of each interaction.
Note
Development of Logic Forest was supported by NIH/NCATS UL1RR029882. Logic Survival Forest development was supported by NIH/NIA R01AG082873.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. (2010). Logic Forest: An ensemble classifier for discovering logical combinations of binary markers. Bioinformatics, 26(17):2183–2189. doi:10.1093/bioinformatics/btq354
Wolf BJ et al. (2012). LBoost: A boosting algorithm with application for epistasis discovery. PLoS One, 7(11):e47281. doi:10.1371/journal.pone.0047281
Hyer JM et al. (2019). Novel Machine Learning Approach to Identify Preoperative Risk Factors Associated With Super-Utilization of Medicare Expenditure Following Surgery. JAMA Surg, 154(11):1014–1021. doi:10.1001/jamasurg.2019.2979
See Also
pimp.import
, logreg.anneal.control
Examples
## Not run:
set.seed(10051988)
N_c <- 50
N_r <- 200
init <- as.data.frame(matrix(0, nrow = N_r, ncol = N_c))
colnames(init) <- paste0("X", 1:N_c)
for(n in 1:N_c){
p <- runif(1, min = 0.2, max = 0.6)
init[,n] <- rbinom(N_r, 1, p)
}
X3X4int <- as.numeric(init$X3 == init$X4)
X5X6int <- as.numeric(init$X5 == init$X6)
y_p <- -2.5 + init$X1 + init$X2 + 2 * X3X4int + 2 * X5X6int
p <- 1 / (1 + exp(-y_p))
init$Y.bin <- rbinom(N_r, 1, p)
# Classification
LF.fit.bin <- logforest("bin", init$Y.bin, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.bin)
# Continuous
init$Y.cont <- rnorm(N_r, mean = 0) + init$X1 + init$X2 + 5 * X3X4int + 5 * X5X6int
LF.fit.lin <- logforest("lin", init$Y.cont, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.lin)
# Time-to-event
shape <- 1 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
scale <- 1.5 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
init$TIME_Y <- rgamma(N_r, shape = shape, scale = scale)
LF.fit.surv <- logforest("exp_surv", init$Y.bin, init$TIME_Y, init[,1:N_c],
nBS=10, nleaves=8, numout=10)
print(LF.fit.surv)
## End(Not run)
Internal Logic Forest Functions
Description
INTERNAL FUNCTION TO CREATE COMBINATIONS OF N.PAIR VARIABLES
This function is called by prime.imp
. It is not intended to be used independently of this function.
Usage
p.combos(n.pair, conj = 0)
Arguments
n.pair |
number of predictors in the tree |
conj |
value denoting absence of variable in combination |
Value
mat matrix with each combination of n.pair variables
Predictor Importance - Variables and Interactions
Description
Measures the predictor importance for each predictor and interaction importance for each iteration.
Usage
pimp.import(fit, data, testdata, BSpred, pred, Xs, mtype)
Arguments
fit |
Fit information including outcome/model type, input data, logic tree, etc. |
data |
In-bag sample (i.e., training data). |
testdata |
Out-of-bag sample (i.e., test data). |
BSpred |
Number of Xs in the interactions (includes not-ed variables). |
pred |
Matrix of predicted values. |
Xs |
Matrix or data frame of zeros and ones for all predictor variables. |
mtype |
Model type: |
Details
This function is called to calculate importance measures for each bootstrapped sample. Importance measures are calculated as differences between the original out-of-bag sample and a permuted out-of-bag sample. Model fit for both samples is evaluated using:
Concordance for classification,
Mean squared error for linear regression,
Harrell's C-index for survival regression.
Value
A list with:
- single.vimp
Vector of predictor importance estimates. One estimate per predictor used in the tree of that sample.
- pimp.vimp
Vector of interaction importance estimates. One estimate per interaction detected in that sample.
- Ipimat
Matrix updated for each sample. Contains all predictors (and their NOT-ed versions if used) for each interaction.
- vec.Xvars
Vector of predictors used in the tree of that sample.
- Xids
Vector of predictor IDs used in the tree of that sample.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
See Also
Predictor Importance Matrix - Classification
Description
Internal function called in pimp.import
to construct a binary/logical matrix of predictors used (columns) in each of the interactions of a sample (rows).
Usage
pimp.mat.bin(pimps.out, testdata)
Arguments
pimps.out |
R object containing |
testdata |
Out-of-bag sample. |
Details
NOTE: For regression models, pimp.mat.nonbin
is used to accommodate complements of logic trees.
Value
A list with:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix of predictors used (columns) in each of the interactions of a sample (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
See Also
Predictor Importance Matrix - Regression
Description
Internal function called in pimp.import
to construct a binary/logical matrix of predictors used (columns) in each of the interactions of a sample (rows).
Usage
pimp.mat.nonbin(pimps.out, testdata)
Arguments
pimps.out |
R object containing |
testdata |
Out-of-bag sample. |
Details
NOTE: For classification models, pimp.mat.bin
is used.
Value
A list with:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix of predictors used (columns) in each of the interactions of a sample (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
See Also
Predict method for logic forest models
Description
Predicts outcomes for new observations using a fitted logic forest model.
Usage
## S3 method for class 'logforest'
predict(object, newdata, cutoff, ...)
Arguments
object |
An object of class |
newdata |
A data frame containing new observations to predict. |
cutoff |
A numeric value indicating the proportion of trees that must predict class 1 for an overall prediction of class 1. |
... |
Additional arguments (currently ignored). |
Value
An object of class "LFprediction"
containing the predicted outcomes.
Internal Logic Forest Functions
Description
INTERNAL FUNCTION TO EVALUATE IMPORTANCE OF PREDICTOR COMBINATIONS
Usage
## S3 method for class 'logreg2'
predict(object, msz, ntr, newbin, newsep, newcens, ...)
Arguments
object |
An object of class |
msz |
Max number of leaves on a tree |
ntr |
Number of trees in object |
newbin |
matrix containing new datapoints for prediction |
newsep |
new number of separate predictors |
newcens |
censoring indicator |
Value
outframe
Internal Logic Forest Functions
Description
This function is called by pimp.import
. It is not intended to be used independently of these functions.
Usage
prime.imp(tree, data, Xs, mtype)
Arguments
tree |
An object of class |
data |
Data used to fit the logforest |
Xs |
Predictors |
mtype |
Model type |
Value
ListPI An object of class "primeImp"
containing all variables and variable interactions
Print method for logic forest predictions
Description
Displays the results of a prediction from a logic forest model.
Usage
## S3 method for class 'LFprediction'
print(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments (currently ignored). |
Value
No return value. This function is called for its side effects (printing).
Print method for logic forest models
Description
Prints the important predictors from a fitted logic forest model.
Usage
## S3 method for class 'logforest'
print(x, sortby = "importance", ...)
Arguments
x |
An object of class |
sortby |
A character string specifying whether to sort the predictors by |
... |
Additional arguments (currently ignored). |
Value
No return value. This function is called for its side effects (printing).
Internal Logic Forest Functions
Description
This function is called by predict.logforest
. It is not intended to be used independently of this function.
It determined the proportion of logic regression trees within a logic forest that predict a class of one for new observations. Additionally it returns predicted values for new observations.
Usage
proportion.positive(predictmatrix, cutoff)
Arguments
predictmatrix |
matrix of predicted values |
cutoff |
proportion of trees that predict class of one required to result in prediction of class of one |
Value
ans list of predicted values and proportion of trees that predict class of one