Title: Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Version: 2.3.0
Date: 2025-04-09
Maintainer: Hudson Golino <hfg9s@virginia.edu>
Description: Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Depends: R (≥ 3.5.0)
License: AGPL (≥ 3.0)
Encoding: UTF-8
LazyData: true
Imports: dendextend, fungible, future, future.apply, GGally, ggplot2, ggpubr, glasso, glassoFast, GPArotation, igraph (≥ 1.3.0), lavaan, Matrix, methods, network, progressr, qgraph, semPlot, sna, stats
Suggests: DEoptim, fitdistrplus, gridExtra, knitr, markdown, pbapply, progress, psych, pwr, RColorBrewer
URL: https://r-ega.net
BugReports: https://github.com/hfgolino/EGAnet/issues
RoxygenNote: 7.3.2
NeedsCompilation: yes
Packaged: 2025-04-09 17:23:43 UTC; alextops
Author: Hudson Golino
Repository: CRAN
Date/Publication: 2025-04-09 23:10:15 UTC
EGAnet-package
Description
Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Author(s)
Hudson Golino <hfg9s@virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Christensen, A. P. (2023).
Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison.
PsyArXiv.
# Related functions: community.unidimensional
Christensen, A. P., Garrido, L. E., & Golino, H. (2023).
Unique variable analysis: A network psychometrics method to detect local dependence.
Multivariate Behavioral Research.
# Related functions: UVA
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023).
Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.
Behavior Research Methods.
# Related functions: EGA
Christensen, A. P., & Golino, H. (2021a).
Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial.
Psych, 3(3), 479-500.
# Related functions: bootEGA, dimensionStability, and itemStability
Christensen, A. P., & Golino, H. (2021b).
Factor or network model? Predictions from neural networks.
Journal of Behavioral Data Science, 1(1), 85-126.
# Related functions: LCT
Christensen, A. P., & Golino, H. (2021c).
On the equivalency of factor and network loadings.
Behavior Research Methods, 53, 1563-1580.
# Related functions: LCT and net.loads
Christensen, A. P., Golino, H., & Silvia, P. J. (2020).
A psychometric network perspective on the validity and validation of personality trait questionnaires.
European Journal of Personality, 34, 1095-1108.
# Related functions: bootEGA, dimensionStability, EGA, itemStability, and UVA
Christensen, A. P., Gross, G. M., Golino, H., Silvia, P. J., & Kwapil, T. R. (2019).
Exploratory graph analysis of the Multidimensional Schizotypy Scale.
Schizophrenia Research, 206, 43-51.
# Related functions: CFA and EGA
Golino, H., Christensen, A. P., Moulder, R., Kim, S., & Boker, S. M. (2021).
Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections.
Psychometrika.
# Related functions: dynEGA and simDFM
Golino, H., & Demetriou, A. (2017).
Estimating the dimensionality of intelligence like data using Exploratory Graph Analysis.
Intelligence, 62, 54-70.
# Related functions: EGA
Golino, H., & Epskamp, S. (2017).
Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research.
PLoS ONE, 12, e0174035.
# Related functions: CFA, EGA, and bootEGA
Golino, H., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
# Related functions: entropyFit, tefi, and vn.entropy
Golino, H., Nesselroade, J. R., & Christensen, A. P. (2022).
Towards a psychology of individuals: The ergodicity information index and a bottom-up approach for finding generalizations.
PsyArXiv.
# Related functions: boot.ergoInfo, ergoInfo, jsd, and infoCluster
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., Thiyagarajan, J. A., & Martinez-Molina, A. (2020).
Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial.
Psychological Methods, 25, 292-320.
# Related functions: EGA
Golino, H., Thiyagarajan, J. A., Sadana, M., Teles, M., Christensen, A. P., & Boker, S. M. (2020).
Investigating the broad domains of intrinsic capacity, functional ability, and environment: An exploratory graph analysis approach for improving analytical methodologies for measuring healthy aging.
PsyArXiv.
# Related functions: EGA.fit and tefi
Jamison, L., Christensen, A. P., & Golino, H. (2021).
Optimizing Walktrap's community detection in networks using the Total Entropy Fit Index.
PsyArXiv.
# Related functions: EGA.fit and tefi
Jamison, L., Golino, H., & Christensen, A. P. (2023).
Metric invariance in exploratory graph analysis via permutation testing.
PsyArXiv.
# Related functions: invariance
Shi, D., Christensen, A. P., Day, E., Golino, H., & Garrido, L. E. (2023).
A Bayesian approach for dimensionality assessment in psychological networks.
PsyArXiv.
# Related functions: EGA
See Also
Useful links:
Report bugs at https://github.com/hfgolino/EGAnet/issues
CFA Fit of EGA or hierEGA Structure
Description
Verifies the fit of the structure suggested by EGA or by hierEGA using confirmatory factor analysis.
Usage
CFA(ega.obj, data, estimator, plot.CFA = TRUE, layout = "spring", ...)
Arguments
ega.obj |
|
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
estimator |
The estimator used in the confirmatory factor analysis.
'WLSMV' is the estimator of choice for ordinal variables.
'ML' or 'WLS' for interval variables.
See |
plot.CFA |
Logical. Should the CFA structure with its standardized loadings be plotted? Defaults to TRUE |
layout |
Layout of plot (see |
... |
Arguments passed to |
Value
Returns a list containing:
fit |
Output from |
summary |
Summary output from |
fit.measures |
Fit measures: chi-squared,
degrees of freedom, p-value, CFI, RMSEA, GFI, and NFI.
Additional fit measures can be applied using the
|
Author(s)
Hudson F. Golino <hfg9s at virginia.edu>
References
Demonstrative use
Christensen, A. P., Gross, G. M., Golino, H., Silvia, P. J., & Kwapil, T. R. (2019).
Exploratory graph analysis of the Multidimensional Schizotypy Scale.
Schizophrenia Research, 206, 43-51.
Initial implementation
Golino, H., & Epskamp, S. (2017).
Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research.
PLoS ONE, 12, e0174035.
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Estimate EGA
ega.wmt <- EGA(
data = wmt,
plot.EGA = FALSE # No plot for CRAN checks
)
# Fit CFA model to EGA results
cfa.wmt <- CFA(
ega.obj = ega.wmt, estimator = "WLSMV",
plot.CFA = FALSE, # No plot for CRAN checks
data = wmt
)
# Additional fit measures
lavaan::fitMeasures(cfa.wmt$fit, fit.measures = "all")
## End(Not run)
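To make the idea of "verifying the structure suggested by EGA" more concrete, the following is a minimal sketch of how a community membership vector could be turned into a lavaan-style measurement model. It is not the package's internal code: the item names, the membership values, and the `Dim` factor prefix are all hypothetical and chosen only for illustration.

```r
# Hypothetical memberships, shaped like the `wc` element of an EGA result
# (item names and values are invented for illustration)
wc <- c(item1 = 1, item2 = 1, item3 = 2, item4 = 2, item5 = 2, item6 = NA)

# Drop items that were not assigned to any community
wc <- wc[!is.na(wc)]

# Build the right-hand side "item + item + ..." for each community
model_lines <- vapply(
  split(names(wc), wc),
  function(items) paste(items, collapse = " + "),
  character(1)
)

# One "factor =~ items" line per community ("Dim" prefix is an assumption)
model_syntax <- paste0(
  "Dim", names(model_lines), " =~ ", model_lines,
  collapse = "\n"
)

cat(model_syntax, "\n")
```

A string of this form could then be handed to `lavaan::cfa()` together with the data and the chosen estimator, which is roughly the workflow that CFA() wraps.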
EBICglasso from qgraph 1.4.4
Description
This function uses the glasso package (Friedman, Hastie & Tibshirani, 2011) to compute a
sparse Gaussian graphical model with the graphical lasso (Friedman, Hastie & Tibshirani, 2008).
The tuning parameter is chosen using the Extended Bayesian Information Criterion
(EBIC) described by Foygel & Drton (2010).
Usage
EBICglasso.qgraph(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
gamma = 0.5,
penalize.diagonal = FALSE,
nlambda = 100,
lambda.min.ratio = 0.1,
fast = FALSE,
returnAllResults = FALSE,
penalizeMatrix = NULL,
countDiagonal = FALSE,
refit = FALSE,
model.selection = c("EBIC", "JSD"),
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
gamma |
Numeric (length = 1).
EBIC tuning parameter.
Defaults to 0.5 |
penalize.diagonal |
Boolean (length = 1).
Should the diagonal be penalized?
Defaults to FALSE |
nlambda |
Numeric (length = 1).
Number of lambda values to test.
Defaults to 100 |
lambda.min.ratio |
Numeric (length = 1).
Ratio of lowest lambda value compared to maximal lambda.
Defaults to 0.1 |
fast |
Boolean (length = 1).
Whether the fast implementation should be used. The fast results may differ by less than floating point of the original
GLASSO implemented by |
returnAllResults |
Boolean (length = 1).
Whether all results should be returned.
Defaults to FALSE |
penalizeMatrix |
Boolean matrix. Optional logical matrix to indicate which elements are penalized |
countDiagonal |
Boolean (length = 1).
Should diagonal be counted in EBIC computation?
Defaults to FALSE |
refit |
Boolean (length = 1).
Should the optimal graph be refitted without LASSO regularization?
Defaults to FALSE |
model.selection |
Character (length = 1).
How lambda should be selected within GLASSO.
Defaults to |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Arguments sent to |
Details
The glasso is run for nlambda values of the tuning parameter (100 by default), logarithmically
spaced between lambda_max, the value of the tuning parameter at which all edges are zero,
and lambda_max * lambda.min.ratio (lambda_max/10 with the default ratio of 0.1).
For each of these graphs the EBIC is computed and
the graph with the best EBIC is selected. The partial correlation matrix
is computed using wi2net and returned.
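The two mechanical steps described above, building the logarithmically spaced grid of tuning parameters and converting a precision matrix into partial correlations, can be sketched in base R. This is an illustrative sketch, not the package's code: the correlation matrix holds made-up values, `precision_to_partial` is a hypothetical helper standing in for the role of wi2net, and the precision matrix is obtained by plain inversion rather than by the graphical lasso.

```r
# Toy correlation matrix (hypothetical values, for illustration only)
S <- matrix(
  c(1.0, 0.5, 0.3,
    0.5, 1.0, 0.4,
    0.3, 0.4, 1.0),
  nrow = 3, byrow = TRUE
)

nlambda <- 100          # number of tuning parameter values
lambda.min.ratio <- 0.1 # ratio of smallest to largest lambda

# Largest absolute off-diagonal correlation: with a penalty this large,
# every off-diagonal element is shrunk to zero
lambda_max <- max(abs(S[upper.tri(S)]))
lambda_min <- lambda_max * lambda.min.ratio

# Logarithmically spaced grid from lambda_min up to lambda_max
lambda <- exp(seq(log(lambda_min), log(lambda_max), length.out = nlambda))

# Convert a precision (inverse covariance) matrix into partial correlations:
# standardize, flip the sign, and zero the diagonal
precision_to_partial <- function(K) {
  P <- -stats::cov2cor(K)
  diag(P) <- 0
  P
}

# Unregularized example: invert S directly instead of running the glasso
partial <- precision_to_partial(solve(S))
```

In the actual procedure each value in `lambda` would produce one regularized precision matrix, and the conversion step would be applied only to the one selected by the fit criterion.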
Value
A partial correlation matrix
Author(s)
Sacha Epskamp; for maintenance, Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>
References
Instantiation of GLASSO
Friedman, J., Hastie, T., & Tibshirani, R. (2008).
Sparse inverse covariance estimation with the graphical lasso.
Biostatistics, 9, 432-441.
glasso + EBIC
Foygel, R., & Drton, M. (2010).
Extended Bayesian information criteria for Gaussian graphical models.
In Advances in neural information processing systems (pp. 604-612).
glasso package
Friedman, J., Hastie, T., & Tibshirani, R. (2011).
glasso: Graphical lasso-estimation of Gaussian graphical models.
R package version 1.7.
Tutorial on EBICglasso
Epskamp, S., & Fried, E. I. (2018).
A tutorial on regularized partial correlation networks.
Psychological Methods, 23(4), 617–634.
Examples
# Obtain data
wmt <- wmt2[,7:24]
# Fast
fast <- EBICglasso.qgraph(wmt, fast = TRUE)
# Regular
regular <- EBICglasso.qgraph(wmt, fast = FALSE)
# Difference between fast and regular
sum(abs(fast - regular))
# Compute graph with tuning = 0 (BIC)
BICgraph <- EBICglasso.qgraph(data = wmt, gamma = 0)
# Compute graph with tuning = 0.5 (EBIC)
EBICgraph <- EBICglasso.qgraph(data = wmt, gamma = 0.5)
Exploratory Graph Analysis
Description
Estimates the number of communities (dimensions) of a dataset or correlation matrix using a network estimation method (Golino & Epskamp, 2017; Golino et al., 2020). Afterward, a community detection algorithm is applied (Christensen et al., 2023) for multidimensional data. A unidimensional check is also applied based on findings from Golino et al. (2020) and Christensen (2023).
Usage
EGA(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
plot.EGA = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
plot.EGA |
Boolean (length = 1).
Defaults to TRUE |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
network |
A matrix containing a network estimated using
|
wc |
A vector representing the community (dimension) membership
of each node in the network. |
n.dim |
A scalar of how many total dimensions were identified in the network |
correlation |
The zero-order correlation matrix |
n |
Number of cases in |
dim.variables |
An ordered matrix of item allocation |
TEFI |
|
plot.EGA |
Plot output if |
Author(s)
Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com>, Maria Dolores Nieto <acinodam at gmail.com> and Luis E. Garrido <garrido.luiseduardo at gmail.com>
References
Original simulation and implementation of EGA
Golino, H. F., & Epskamp, S. (2017).
Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research.
PLoS ONE, 12, e0174035.
Current implementation of EGA, introduced unidimensional checks, continuous and dichotomous data
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2020).
Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial.
Psychological Methods, 25, 292-320.
Compared all igraph community detection algorithms, introduced Louvain algorithm, simulation with continuous and polytomous data
Also implements the Leading Eigenvalue unidimensional method
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023).
Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.
Behavior Research Methods.
Comprehensive unidimensionality simulation
Christensen, A. P. (2023).
Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison.
PsyArXiv.
Compared all igraph community detection algorithms, simulation with continuous and polytomous data
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023).
Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.
Behavior Research Methods.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain data
wmt <- wmt2[,7:24]
# Estimate EGA
ega.wmt <- EGA(
data = wmt,
plot.EGA = FALSE # No plot for CRAN checks
)
# Print results
print(ega.wmt)
# Estimate EGA with TMFG
ega.wmt.tmfg <- EGA(
data = wmt, model = "TMFG",
plot.EGA = FALSE # No plot for CRAN checks
)
# Estimate EGA with Louvain algorithm
ega.wmt.louvain <- EGA(
data = wmt, algorithm = "louvain",
plot.EGA = FALSE # No plot for CRAN checks
)
# Estimate EGA with an {igraph} function (Fast-greedy)
ega.wmt.greedy <- EGA(
data = wmt,
algorithm = igraph::cluster_fast_greedy,
plot.EGA = FALSE # No plot for CRAN checks
)
Estimates EGA for Multidimensional Structures
Description
A basic function to estimate EGA for multidimensional structures.
This function does not include the unidimensional check and it does not
plot the results. This function can be used as a streamlined approach
for quick EGA estimation when unidimensionality or visualization
is not a priority.
Usage
EGA.estimate(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
network |
A matrix containing a network estimated using
|
wc |
A vector representing the community (dimension) membership
of each node in the network. |
n.dim |
A scalar of how many total dimensions were identified in the network |
cor.data |
The zero-order correlation matrix |
n |
Number of cases in |
Author(s)
Alexander P. Christensen <alexpaulchristensen at gmail.com> and Hudson Golino <hfg9s at virginia.edu>
References
Original simulation and implementation of EGA
Golino, H. F., & Epskamp, S. (2017).
Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research.
PLoS ONE, 12, e0174035.
Introduced unidimensional checks, simulation with continuous and dichotomous data
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2020).
Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial.
Psychological Methods, 25, 292-320.
Compared all igraph community detection algorithms, simulation with continuous and polytomous data
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023).
Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.
Behavior Research Methods.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain data
wmt <- wmt2[,7:24]
# Estimate EGA
ega.wmt <- EGA.estimate(data = wmt)
# Estimate EGA with TMFG
ega.wmt.tmfg <- EGA.estimate(data = wmt, model = "TMFG")
# Estimate EGA with an {igraph} function (Fast-greedy)
ega.wmt.greedy <- EGA.estimate(
data = wmt,
algorithm = igraph::cluster_fast_greedy
)
EGA Optimal Model Fit using the Total Entropy Fit Index (tefi)
Description
Estimates the best fitting model using EGA.
The number of steps in the cluster_walktrap detection
algorithm is varied and unique community solutions are compared using tefi.
Usage
EGA.fit(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
plot.EGA = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
plot.EGA |
Boolean.
If |
verbose |
Boolean.
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
EGA |
|
EntropyFit |
|
Lowest.EntropyFit |
The best fitting solution based on |
parameter.space |
Parameter values used in search space |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Entropy fit measures
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (in press).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
Simulation for EGA.fit
Jamison, L., Christensen, A. P., & Golino, H. (under review).
Optimizing Walktrap's community detection in networks using the Total Entropy Fit Index.
PsyArXiv.
Leiden algorithm
Traag, V. A., Waltman, L., & Van Eck, N. J. (2019).
From Louvain to Leiden: guaranteeing well-connected communities.
Scientific Reports, 9(1), 1-12.
Louvain algorithm
Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008).
Fast unfolding of communities in large networks.
Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Walktrap algorithm
Pons, P., & Latapy, M. (2006).
Computing communities in large networks using random walks.
Journal of Graph Algorithms and Applications, 10, 191-218.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate optimal EGA with Walktrap
fit.walktrap <- EGA.fit(
data = wmt, algorithm = "walktrap",
steps = 3:8, # default
plot.EGA = FALSE # no plot for CRAN checks
)
# Estimate optimal EGA with Leiden and CPM
fit.leiden <- EGA.fit(
data = wmt, algorithm = "leiden",
objective_function = "CPM", # default
# resolution_parameter = seq.int(0, max(abs(network)), 0.01),
# For CPM, the default max resolution parameter
# is set to the largest absolute edge in the network
plot.EGA = FALSE # no plot for CRAN checks
)
# Estimate optimal EGA with Leiden and modularity
fit.leiden <- EGA.fit(
data = wmt, algorithm = "leiden",
objective_function = "modularity",
resolution_parameter = seq.int(0, 2, 0.05),
# default for modularity
plot.EGA = FALSE # no plot for CRAN checks
)
## Not run:
# Estimate optimal EGA with Louvain
fit.louvain <- EGA.fit(
data = wmt, algorithm = "louvain",
resolution_parameter = seq.int(0, 2, 0.05), # default
plot.EGA = FALSE # no plot for CRAN checks
)
## End(Not run)
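The selection logic described for this function (vary a parameter, keep only the unique community solutions, pick the one with the lowest entropy fit) can be sketched in base R. This is a hedged illustration rather than the package's implementation: the membership vectors are invented, and the fit values are placeholders that were not produced by tefi.

```r
# Hypothetical memberships returned for four parameter values
solutions <- list(
  steps_3 = c(1, 1, 1, 2, 2, 2),
  steps_4 = c(2, 2, 2, 1, 1, 1), # same partition as steps_3, labels swapped
  steps_5 = c(1, 1, 2, 2, 3, 3),
  steps_6 = c(1, 1, 2, 2, 3, 3)  # identical to steps_5
)

# Relabel communities by order of first appearance so that
# label permutations of the same partition produce the same key
canonical <- lapply(solutions, function(wc) match(wc, unique(wc)))
keys <- vapply(canonical, paste, character(1), collapse = "-")

# Keep one representative per unique partition
unique_solutions <- canonical[!duplicated(keys)]

# Placeholder fit values (NOT real TEFI output); lower is treated as better
fit <- c(steps_3 = -10.2, steps_5 = -11.7)

# Name of the best fitting unique solution
best <- names(which.min(fit[names(unique_solutions)]))
```

Deduplicating before scoring keeps the comparison to one fit value per distinct structure, which is the point of comparing only unique solutions.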
S3 Plot Methods for EGAnet
Description
General usage for plots created by EGAnet's S3 methods.
Plots across the EGAnet package leverage GGally's ggnet2 and ggplot2's ggplot.
Most plots allow the full usage of the gg* series functionality and therefore
plotting arguments should be referenced through those packages rather than here in EGAnet.
The sections below list the functions and their usage for the S3 plot methods.
The plot methods are intended to be generic and without many arguments so that
nearly all arguments are passed to ggnet2 and ggplot.
There are some constraints placed on certain plots to keep the EGAnet style
throughout the (network) plots in the package, so be aware that if some settings are
not changing your plot output, then these settings might be fixed
to maintain the EGAnet style.
General Usage
plot(x, ...)
plot.dynEGA(x, base = 1, id = NULL, ...)
plot.dynEGA.Group(x, base = 1, ...)
plot.dynEGA.Individual(x, base = 1, id = NULL, ...)
plot.hierEGA(x, plot.type = c("multilevel", "separate"), color.match = FALSE, ...)
plot.invariance(x, p_type = c("p", "p_BH"), p_value = 0.05, ...)
plot.TEFI.compare(x, base.name, comparison.name, base.color, comparison.color, ...)
General Arguments
- x: EGAnet object with available S3 plot method (see full list below)
- color.palette: Character (vector). Either a character (length = 1) from the pre-defined palettes in color_palette_EGA or character (length = total number of communities) using HEX codes (see Color Palettes and Examples sections)
- layout: Character (length = 1). Layouts can be set using gplot.layout and the ending layout name; for example, gplot.layout.circle can be set in these functions using layout = "circle" or mode = "circle" (see Examples)
- base: Numeric (length = 1). Plot to be used as the base for the configuration of the networks. Uses the number of the order in which the plots are input. Defaults to 1 or the first plot
- id: Numeric index(es) or character name(s). IDs to use when plotting dynEGA level = "individual". Defaults to NULL or 4 IDs drawn at random
- plot.type: Character (length = 1). Whether hierEGA networks should be plotted in a stacked, "multilevel" fashion or as "separate" plots. Defaults to "multilevel"
- color.match: Boolean (length = 1). Whether lower order community colors in the hierEGA plot should be "matched" and used as the border color for the higher order communities. Defaults to FALSE
- p_type: Character (length = 1). Type of p-value when plotting invariance. Defaults to "p" or uncorrected p-value. Set to "p_BH" for the Benjamini-Hochberg corrected p-value
- p_value: Numeric (length = 1). The p-value to use alongside p_type when plotting invariance. Defaults to 0.05
- base.name: Character (length = 1). A string to label the base structure in the plot. Defaults to "Base"
- comparison.name: Character (length = 1). A string to label the comparison structure in the plot. Defaults to "Comparison"
- base.color: Character (length = 1). A string specifying the color of the base structure in the plot. Hex codes can be used. Defaults to "blue"
- comparison.color: Character (length = 1). A string specifying the color of the comparison structure in the plot. Hex codes can be used. Defaults to "red"
- ...: Additional arguments to pass on to ggnet2 and gplot.layout (see Examples)
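The practical difference between the two p_type options can be illustrated with base R's `p.adjust`. The p-values below are hypothetical and the variable names are only for illustration; this sketch shows the Benjamini-Hochberg correction itself, not how the invariance plot method computes or stores its values.

```r
# Hypothetical uncorrected p-values for four items
p <- c(0.01, 0.04, 0.03, 0.20)

# Benjamini-Hochberg corrected p-values
p_BH <- stats::p.adjust(p, method = "BH")

# Threshold playing the role of the p_value argument
p_value <- 0.05

# Items flagged under each p-value type
flag_p <- p < p_value
flag_BH <- p_BH < p_value
```

With these made-up values the correction is more conservative: fewer items fall below the threshold after adjustment than before it.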
*EGA Plots
bootEGA, dynEGA, EGA, EGA.estimate, EGA.fit, hierEGA, invariance, riEGA
All Available S3 Plot Methods
boot.ergoInfo, bootEGA, dynEGA, dynEGA.Group, dynEGA.Individual, dynEGA.Population, EGA, EGA.estimate, EGA.fit, hierEGA, infoCluster, invariance, itemStability, riEGA
Color Palettes
color_palette_EGA will implement some color palettes in EGAnet.
The main EGAnet style palette is "polychrome".
This palette currently has 40 colors but there will likely be a need to expand it further
(e.g., hierEGA demands a lot of colors).
The color.palette argument will also accept HEX code colors that
are the same length as the number of communities in the plot.
In any network plots, the color.palette argument can be used to
select color palettes from color_palette_EGA as well
as those in the color scheme of RColorBrewer.
For more worked examples than below, see Plots in {EGAnet}
Examples
# Using different arguments in {GGally}'s `ggnet2`
plot(ega.wmt, node.size = 6, edge.size = 4)
# Using a different layout in {sna}'s `gplot.layout`
plot(ega.wmt, layout = "circle") # 'layout' argument
plot(ega.wmt, mode = "circle") # 'mode' argument
# Using different color palettes with `color_palette_EGA`
## Pre-defined palette
plot(ega.wmt, color.palette = "blue.ridge2")
## University of Virginia colors
plot(ega.wmt, color.palette = c("#232D4B", "#F84C1E"))
## Vanderbilt University colors
## (with additional {GGally} `ggnet2` argument)
plot(
ega.wmt, color.palette = c("#FFFFFF", "#866D4B"),
label.color = "#000000"
)
Exploratory Graph Model
Description
Function to fit the Exploratory Graph Model
Usage
EGM(
data,
EGM.model = c("standard", "EGA"),
communities = NULL,
structure = NULL,
search = FALSE,
p.in = NULL,
p.out = NULL,
opt = c("AIC", "BIC", "logLik", "SRMR"),
constrain.structure = TRUE,
constrain.zeros = TRUE,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix |
EGM.model |
Character vector (length = 1).
Sets the procedure to conduct
|
communities |
Numeric vector (length = 1).
Number of communities to use for the |
structure |
Numeric or character vector (length = |
search |
Boolean (length = 1).
Whether a search over parameters should be conducted.
Defaults to FALSE |
p.in |
Numeric vector (length = 1).
Probability that a node is randomly linked to other nodes in the same community.
Within community edges are set to zero based on |
p.out |
Numeric vector (length = 1).
Probability that a node is randomly linked to other nodes not in the same community.
Between community edges are set to zero based on |
opt |
Character vector (length = 1).
Fit index used to select from when searching over models
(only applies to
Defaults to |
constrain.structure |
Boolean (length = 1).
Whether memberships of the communities should
be added as a constraint when optimizing the network loadings.
Defaults to TRUE |
constrain.zeros |
Boolean (length = 1).
Whether zeros in the estimated network loading matrix should
be retained when optimizing the network loadings.
Defaults to TRUE |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to TRUE |
... |
Additional arguments to be passed on to
|
Author(s)
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Get depression data
data <- depression[,24:44]
# Estimate EGM (using EGA)
egm_ega <- EGM(data)
# Estimate EGM (using EGA) specifying communities
egm_ega_communities <- EGM(data, communities = 3)
# Estimate EGM (using EGA) specifying structure
egm_ega_structure <- EGM(
data, structure = c(
1, 1, 1, 2, 1, 1, 1,
1, 1, 1, 3, 2, 2, 2,
2, 3, 3, 3, 3, 3, 2
)
)
# Estimate EGM (using standard)
egm_standard <- EGM(
data, EGM.model = "standard",
communities = 3, # specify number of communities
p.in = 0.95, # probability of edges *in* each community
p.out = 0.80 # probability of edges *between* each community
)
## Not run:
# Estimate EGM (using EGA search)
egm_ega_search <- EGM(
data, EGM.model = "EGA", search = TRUE
)
# Estimate EGM (using EGA search and AIC criterion)
egm_ega_search_AIC <- EGM(
data, EGM.model = "EGA", search = TRUE, opt = "AIC"
)
# Estimate EGM (using search)
egm_search <- EGM(
data, EGM.model = "standard", search = TRUE,
communities = 3, # need communities or structure
p.in = 0.95 # only need 'p.in'
)
## End(Not run)
Compare EGM with EFA
Description
Estimates an EGM based on EGA and
uses the number of communities as the number of dimensions in exploratory factor analysis
(EFA) using fa
Usage
EGM.compare(
data,
constrain.structure = FALSE,
constrain.zeros = FALSE,
rotation = "geominQ",
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix |
constrain.structure |
Boolean (length = 1).
Whether memberships of the communities should
be added as a constraint when optimizing the network loadings.
Defaults to FALSE. Note: This default differs from |
constrain.zeros |
Boolean (length = 1).
Whether zeros in the estimated network loading matrix should
be retained when optimizing the network loadings.
Defaults to FALSE. Note: This default differs from |
rotation |
Character.
A rotation to use to obtain a simpler structure for EFA.
For a list of rotations, see |
... |
Additional arguments to be passed on to
|
Author(s)
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Get depression data
data <- depression[,24:44]
# Compare EGM (using EGA) with EFA
## Not run:
results <- EGM.compare(data)
# Print summary
summary(results)
## End(Not run)
Time-delay Embedding
Description
Reorganizes a single observed time series into an embedded matrix. The embedded matrix is constructed with replicates of an individual time series that are offset from each other in time. The function requires two parameters, one that specifies the number of observations to be used (i.e., the number of embedded dimensions) and the other that specifies the number of observations to offset successive embeddings
Usage
Embed(x, E, tau)
Arguments
x |
Numeric vector. An observed time series to be reorganized into a time-delayed embedded matrix. |
E |
Numeric (length = 1).
Number of embedded dimensions or the number of observations to
be used. |
tau |
Numeric (length = 1).
Number of observations to offset successive embeddings.
A tau of one uses adjacent observations.
Default is |
Value
Returns a numeric matrix
Author(s)
Pascal Deboeck <pascal.deboeck at psych.utah.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Deboeck, P. R., Montpetit, M. A., Bergeman, C. S., & Boker, S. M. (2009) Using derivative estimates to describe intraindividual variability at multiple time scales. Psychological Methods, 14, 367-386.
Examples
# A time series with 8 time points
time_series <- 49:56
# Time series embedding
Embed(time_series, E = 5, tau = 1)
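To make the layout of the embedded matrix concrete, the base R sketch below builds one by hand. It is not the package's implementation: embed_sketch is a hypothetical helper, and placing each successive embedding in its own row (with E columns) is an assumption about orientation.

```r
# Hypothetical helper: time-delay embedding in base R
embed_sketch <- function(x, E, tau = 1) {
  # Number of complete embeddings that fit in the series
  n_rows <- length(x) - (E - 1) * tau
  stopifnot(n_rows >= 1)

  # Row r, column c picks observation r + (c - 1) * tau
  starts <- seq_len(n_rows)
  offsets <- (seq_len(E) - 1) * tau
  index <- outer(starts, offsets, `+`)

  matrix(x[index], nrow = n_rows, ncol = E)
}

# Same series as the example above: 8 time points, 5 embedded dimensions
embedded <- embed_sketch(49:56, E = 5, tau = 1)
dim(embedded)   # 4 rows (embeddings) by 5 columns (dimensions)
embedded[1, ]   # 49 50 51 52 53
```

With tau = 2 the columns would instead be two observations apart, so fewer complete embeddings fit in the same series.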
Loadings Comparison Test
Description
An algorithm to identify whether data were generated from a factor or network model using factor and network loadings. The algorithm uses heuristics based on theory and simulation. These heuristics were then submitted to several deep learning neural networks with 240,000 samples per model with varying parameters.
Usage
LCT(
data,
n = NULL,
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
iter = 100,
seed = NULL,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
iter |
Numeric (length = 1).
Number of replicate samples to be drawn from a multivariate
normal distribution (uses |
seed |
Numeric (length = 1).
Defaults to NULL |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to TRUE |
... |
Additional arguments that can be passed on to
|
Value
Returns a list containing:
empirical |
Prediction of model based on empirical dataset only |
bootstrap |
Prediction of model based on means of the loadings across the bootstrap replicate samples |
proportion |
Proportions of models suggested across bootstraps |
Author(s)
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>
References
Model training and validation
Christensen, A. P., & Golino, H. (2021).
Factor or network model? Predictions from neural networks.
Journal of Behavioral Data Science, 1(1), 85-126.
Examples
# Get data
data <- psych::bfi[,1:25]
## Not run:
# Compute LCT
## Factor model
LCT(data)
## End(Not run)
Triangulated Maximally Filtered Graph
Description
Applies the Triangulated Maximally Filtered Graph (TMFG) filtering method (see Massara et al., 2016). The TMFG method uses a structural constraint that limits the number of zero-order correlations included in the network (3n - 6; where n is the number of variables). The TMFG algorithm begins by identifying four variables which have the largest sum of correlations to all other variables. Then, it iteratively adds each variable with the largest sum of three correlations to nodes already in the network until all variables have been added to the network. This structure can be associated with the inverse correlation matrix (i.e., precision matrix) and turned into a GGM (i.e., partial correlation network) by using the Local-Global Inversion Method (LoGo; see Barfuss et al., 2016 for more details). See Details for more information
Usage
TMFG(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
partial = FALSE,
returnAllResults = FALSE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or correlation matrix |
n |
Numeric (length = 1).
Sample size for when a correlation matrix is input into |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
partial |
Boolean (length = 1).
Whether partial correlations should be output.
Defaults to FALSE |
returnAllResults |
Boolean (length = 1).
Whether all results should be returned.
Defaults to FALSE |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Additional arguments to be passed on to
|
Details
The TMFG method applies a structural constraint on the network, which restrains the network to retain a certain number of edges (3n-6, where n is the number of nodes; Massara et al., 2016). The network is also composed of 3- and 4-node cliques (i.e., sets of connected nodes; a triangle and tetrahedron, respectively). The TMFG method constructs a network using zero-order correlations and the resulting network can be associated with the inverse covariance matrix (yielding a GGM; Barfuss, Massara, Di Matteo, & Aste, 2016). Notably, the TMFG can use any association measure and thus does not assume the data is multivariate normal.
Construction begins by forming a tetrahedron of the four nodes that have the highest sum of correlations that are greater than the average correlation in the correlation matrix. Next, the algorithm iteratively identifies the node that maximizes its sum of correlations to a connected set of three nodes (triangles) already included in the network and then adds that node to the network. The process is completed once every node is connected in the network. In this process, the network automatically generates what's called a planar network. A planar network is a network that could be drawn on a sphere with no edges crossing (often, however, the networks are depicted with edges crossing; Tumminello, Aste, Di Matteo, & Mantegna, 2005).
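The construction described above can be sketched in base R. This is a simplified illustration rather than the package's code: tmfg_sketch is a hypothetical helper, it works on absolute correlations (an assumption), it breaks ties by taking the first maximum, and it returns only an unweighted adjacency matrix.

```r
# Hypothetical helper: simplified TMFG construction on a correlation matrix
tmfg_sketch <- function(R) {
  n <- ncol(R)
  A <- abs(R)            # assumption: absolute correlations serve as weights
  diag(A) <- 0

  # Seed tetrahedron: four nodes with the largest sum of above-average correlations
  above_avg <- A * (A > mean(A[upper.tri(A)]))
  seed_nodes <- order(rowSums(above_avg), decreasing = TRUE)[1:4]

  adj <- matrix(0, n, n)
  adj[seed_nodes, seed_nodes] <- 1
  diag(adj) <- 0

  # Each row is a triangle (face) that a new node can attach to
  faces <- t(combn(seed_nodes, 3))
  remaining <- setdiff(seq_len(n), seed_nodes)

  while (length(remaining) > 0) {
    # gain[f, v]: sum of correlations between candidate v and the nodes of face f
    gain <- sapply(remaining, function(v) {
      rowSums(matrix(A[v, faces], nrow = nrow(faces)))
    })
    gain <- matrix(gain, nrow = nrow(faces))
    best <- unname(which(gain == max(gain), arr.ind = TRUE)[1, ])
    face <- faces[best[1], ]
    v <- remaining[best[2]]

    # Connect the new node to the three nodes of the chosen triangle
    adj[v, face] <- 1
    adj[face, v] <- 1

    # The chosen triangle is replaced by the three triangles that include v
    faces <- rbind(
      faces[-best[1], , drop = FALSE],
      c(face[1], face[2], v), c(face[1], face[3], v), c(face[2], face[3], v)
    )
    remaining <- setdiff(remaining, v)
  }
  adj
}

set.seed(42)
X <- matrix(rnorm(300 * 10), ncol = 10)   # simulated data, 10 variables
R <- cor(X)
adj <- tmfg_sketch(R)
sum(adj) / 2   # number of edges: 3 * 10 - 6 = 24
```

The edge count follows from the construction: the tetrahedron contributes 6 edges and each of the remaining n - 4 nodes adds 3, which gives 3n - 6.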
Value
Returns a network or list containing:
network |
The filtered adjacency matrix |
separators |
The separators (3-cliques) in the network |
cliques |
The cliques (4-cliques) in the network |
Author(s)
Alexander Christensen <alexpaulchristensen@gmail.com>
References
Local-Global Inversion Method
Barfuss, W., Massara, G. P., Di Matteo, T., & Aste, T. (2016).
Parsimonious modeling with information filtering networks.
Physical Review E, 94, 062306.
Psychometric network introduction to TMFG
Christensen, A. P., Kenett, Y. N., Aste, T., Silvia, P. J., & Kwapil, T. R. (2018).
Network structure of the Wisconsin Schizotypy Scales-Short Forms: Examining psychometric network filtering approaches.
Behavior Research Methods, 50, 2531-2550.
Triangulated Maximally Filtered Graph
Massara, G. P., Di Matteo, T., & Aste, T. (2016).
Network filtering for big data: Triangulated maximally filtered graph.
Journal of Complex Networks, 5, 161-178.
Examples
# TMFG filtered network
TMFG(wmt2[,7:24])
# Partial correlations using the LoGo method
TMFG(wmt2[,7:24], partial = TRUE)
Unique Variable Analysis
Description
Identifies locally dependent (redundant) variables in a
multivariate dataset using the EBICglasso.qgraph
network estimation method and weighted topological overlap
(see Christensen, Garrido, & Golino, 2023 for more details)
Usage
UVA(
data = NULL,
network = NULL,
n = NULL,
key = NULL,
uva.method = c("MBR", "EJP"),
cut.off = 0.25,
reduce = TRUE,
reduce.method = c("latent", "mean", "remove", "sum"),
auto = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame.
Should consist only of variables to be used in the analysis.
Can be raw data or a correlation matrix.
Defaults to NULL |
network |
Symmetric matrix or data frame.
A symmetric network.
Defaults to NULL. If both |
n |
Numeric (length = 1).
Sample size if |
key |
Character vector (length = |
uva.method |
Character (length = 1).
Whether the method described in Christensen, Garrido, and
Golino (2023) publication in Multivariate Behavioral Research
( Based on simulation and accumulating empirical evidence, the methods described in Christensen, Golino, and Silvia (2020) such as adaptive alpha are outdated. Evidence supports using a single cut-off value (regardless of continuous, polytomous, or dichotomous data; Christensen, Garrido, & Golino, 2023) |
cut.off |
Numeric (length = 1).
Cut-off used to determine when pairwise This cut-off value is recommended and based on extensive simulation
(Christensen, Garrido, & Golino, 2023). Printing the result will
provide a gradient of pairwise redundancies in increments of 0.20,
0.25, and 0.30. Use |
reduce |
Logical (length = 1).
Whether redundancies should be reduced in data.
Defaults to TRUE |
reduce.method |
Character (length = 1). Method to reduce redundancies. Available options:
|
auto |
Logical (length = 1).
Whether
|
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to FALSE |
... |
Additional arguments that should be passed on to
old versions of |
References
Most recent simulation and implementation
Christensen, A. P., Garrido, L. E., & Golino, H. (2023).
Unique variable analysis: A network psychometrics method to detect local dependence.
Multivariate Behavioral Research.
Conceptual foundation and outdated methods
Christensen, A. P., Golino, H., & Silvia, P. J. (2020).
A psychometric network perspective on the validity and validation of personality trait questionnaires.
European Journal of Personality, 34(6), 1095-1108.
Weighted topological overlap
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009).
Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain.
Proceedings of the National Academy of Sciences, 106, 22358-22363.
Selection of CFA Estimator
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012).
When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions.
Psychological Methods, 17(3), 354-373.
Examples
# Perform UVA
uva.wmt <- UVA(wmt2[,7:24])
# Show summary
summary(uva.wmt)
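As a rough illustration of weighted topological overlap, the base R sketch below applies the signed formula commonly attributed to Nowick et al. (2009) to a small hand-made network and flags pairs above the 0.25 cut-off. The helper wto_sketch, the toy network, and the exact form of the formula are assumptions made for illustration; UVA's internal computation may differ.

```r
# Hypothetical helper: weighted topological overlap for a symmetric network A
# Assumed formula: (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - |a_ij|),
# where k_i is the sum of absolute edge weights of node i
wto_sketch <- function(A) {
  diag(A) <- 0
  k <- rowSums(abs(A))
  numerator <- A %*% A + A
  denominator <- outer(k, k, pmin) + 1 - abs(A)
  W <- numerator / denominator
  diag(W) <- 0
  W
}

# Toy network with three nodes (made-up edge weights)
A <- matrix(
  c(0.0, 0.5, 0.4,
    0.5, 0.0, 0.3,
    0.4, 0.3, 0.0),
  nrow = 3, byrow = TRUE
)
W <- wto_sketch(A)

# Pairs whose overlap exceeds the cut-off would be candidates for redundancy
which(W > 0.25 & upper.tri(W), arr.ind = TRUE)
```

Two nodes obtain a high value when they are strongly connected to each other and share similar connections to the rest of the network, which is why the measure is used here as a signal of local dependence.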
Automatic correlations
Description
This wrapper is similar to cor_auto. There are some minor adjustments that make this function simpler and allow it to function within EGAnet. NA values are not treated as categories (this behavior differs from cor_auto)
Usage
auto.correlate(
data,
corr = c("cosine", "kendall", "pearson", "spearman"),
ordinal.categories = 7,
forcePD = TRUE,
na.data = c("pairwise", "listwise"),
empty.method = c("none", "zero", "all"),
empty.value = c("none", "point_five", "one_over"),
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
corr |
Character (length = 1).
The standard correlation method to be used.
Defaults to |
ordinal.categories |
Numeric (length = 1).
Up to the number of categories before a variable is considered continuous.
Defaults to 7 |
forcePD |
Boolean (length = 1).
Whether positive definite matrix should be enforced.
Defaults to TRUE |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
empty.method |
Character (length = 1).
Method for empty cell correction in
|
empty.value |
Character (length = 1).
Value to add to the joint frequency table cells in
|
verbose |
Boolean (length = 1).
Whether messages should be printed.
Defaults to FALSE |
... |
Not actually used but makes it easier for general functionality in the package |
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Load data
wmt <- wmt2[,7:24]
# Obtain correlations
wmt_corr <- auto.correlate(wmt)
Bootstrap Test for the Ergodicity Information Index
Description
Tests the Ergodicity Information Index (EII) obtained in the empirical sample against a distribution of EII values obtained by a variant of bootstrap sampling (see Details for the procedure)
Usage
boot.ergoInfo(
dynEGA.object,
EII,
use = c("edge.list", "unweighted", "weighted"),
shuffles = 5000,
iter = 100,
ncores,
verbose = TRUE
)
Arguments
dynEGA.object |
A |
EII |
A |
use |
Character (length = 1).
A string indicating what network element will be used
to compute the algorithm complexity, the list of edges or the weights of the network.
Defaults to
|
shuffles |
Numeric.
Number of shuffles used to compute the Kolmogorov complexity.
Defaults to 5000 |
iter |
Numeric (length = 1).
Number of replica samples to generate from the bootstrap analysis.
Defaults to 100 |
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to If you're unsure how many cores your computer has,
then type: |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to TRUE |
Details
In traditional bootstrap sampling, individual participants are resampled with replacement from the empirical sample. This process is time consuming when carried out across v number of variables, n number of participants, t number of time points, and i number of iterations. Instead, boot.ergoInfo uses the premise of an ergodic process to establish a more efficient test that works directly on the sample's networks.
With an ergodic process, the expectation is that all individuals will have a systematic relationship with the population. Destroying this relationship should result in a significant loss of information. Following this conjecture, boot.ergoInfo shuffles a random subset of edges that exist in the population that is equal to the number of shared edges it has with an individual. An individual's unique edges remain the same, controlling for their unique information. The result is a replicate individual that contains the same total number of edges as the actual individual but its shared information with the population has been scrambled.
This process is repeated over each individual to create a replicate sample and is repeated for X iterations (e.g., 100). This approach creates a sampling distribution that represents the expected information between the population and individuals when a random process generates the shared information between them. If the shared information between the population and individuals in the empirical sample is sufficiently meaningful, then this process should result in significant information loss.
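One reading of the replicate-construction step can be sketched in base R on unweighted adjacency matrices. The helper scramble_shared and the two toy networks are hypothetical, and the sketch only mirrors the verbal description above: keep the individual's unique edges, then draw as many population edges at random as the individual originally shared with the population. The package's actual procedure may differ in its details.

```r
# Hypothetical helper: build one replicate individual from unweighted networks
scramble_shared <- function(pop, ind) {
  ut <- upper.tri(pop)
  pop_edges <- which(ut & pop != 0)   # edges present in the population
  ind_edges <- which(ut & ind != 0)   # edges present in the individual

  shared <- intersect(ind_edges, pop_edges)
  unique_ind <- setdiff(ind_edges, pop_edges)

  # Random subset of population edges, equal in number to the shared edges
  drawn <- pop_edges[sample.int(length(pop_edges), length(shared))]

  replicate_adj <- matrix(0, nrow(pop), ncol(pop))
  replicate_adj[c(unique_ind, drawn)] <- 1
  replicate_adj + t(replicate_adj)    # restore symmetry
}

# Toy population and individual networks with 5 nodes (made-up edges)
pop <- matrix(0, 5, 5)
pop[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1
pop <- pop + t(pop)

ind <- matrix(0, 5, 5)
ind[rbind(c(1, 2), c(3, 4), c(2, 5))] <- 1
ind <- ind + t(ind)

set.seed(1)
replicate_ind <- scramble_shared(pop, ind)
sum(replicate_ind) / 2   # same total number of edges as the individual
```

Repeating this for every individual gives one replicate sample, and repeating that for each iteration gives the distribution of EII values used in the test.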
How to interpret the results: the result of boot.ergoInfo is a sampling distribution of EII values that would be expected if the process was random (null distribution). If the empirical EII value is greater than or not significantly different from the null distribution, then the empirical data can be expected to be generated from a nonergodic process and the population structure is not sufficient to describe all individuals. If the empirical EII value is significantly lower than the null distribution, then the empirical data can be described by the population structure; that is, the population structure is sufficient to describe all individuals.
Value
Returns a list containing:
empirical.ergoInfo |
Empirical Ergodicity Information Index |
boot.ergoInfo |
The values of the Ergodicity Information Index obtained in the bootstrap |
p.value |
The two-sided p-value of the bootstrap test for the Ergodicity Information Index. The null hypothesis is that the empirical Ergodicity Information Index is equal to or greater than the expected value of the EII with small variation in the population structure |
effect |
Indicates whether the empirical EII is greater or less than the bootstrap distribution of EII. |
interpretation |
How you can interpret the result of the test in plain English |
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
References
Original Implementation
Golino, H., Nesselroade, J. R., & Christensen, A. P. (2022).
Toward a psychology of individuals: The ergodicity information index and a bottom-up approach for finding generalizations.
PsyArXiv.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain simulated data
sim.data <- sim.dynEGA
## Not run:
# Dynamic EGA individual and population structures
dyn1 <- dynEGA.ind.pop(
data = sim.dynEGA[,-26], n.embed = 5, tau = 1,
delta = 1, id = 25, use.derivatives = 1,
model = "glasso", ncores = 2, corr = "pearson"
)
# Empirical Ergodicity Information Index
eii1 <- ergoInfo(dynEGA.object = dyn1, use = "unweighted")
# Bootstrap Test for Ergodicity Information Index
testing.ergoinfo <- boot.ergoInfo(
dynEGA.object = dyn1, EII = eii1,
ncores = 2, use = "unweighted"
)
# Plot result
plot(testing.ergoinfo)
# Example using `dynEGA`
dyn2 <- dynEGA(
data = sim.dynEGA, n.embed = 5, tau = 1,
delta = 1, use.derivatives = 1, ncores = 2,
level = c("individual", "population")
)
# Empirical Ergodicity Information Index
eii2 <- ergoInfo(dynEGA.object = dyn2, use = "unweighted")
# Bootstrap Test for Ergodicity Information Index
testing.ergoinfo2 <- boot.ergoInfo(
dynEGA.object = dyn2, EII = eii2,
ncores = 2
)
# Plot result
plot(testing.ergoinfo2)
## End(Not run)
bootEGA Results of wmt2 Data
Description
bootEGA results from boot.wmt <- bootEGA(wmt2[,7:24], seed = 1234)
Usage
data(boot.wmt)
Format
A list with 12 objects (see Value in bootEGA)
Examples
data("boot.wmt")
Bootstrap Exploratory Graph Analysis
Description
bootEGA estimates the number of dimensions of iter bootstraps using the empirical zero-order correlation matrix ("parametric") or "resampling" from the empirical dataset (non-parametric). bootEGA estimates a typical median network structure, which is formed by the median or mean pairwise (partial) correlations over the iter bootstraps (see Details for information about the typical median network structure).
Usage
bootEGA(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
iter = 500,
type = c("parametric", "resampling"),
ncores,
EGA.type = c("EGA", "EGA.fit", "hierEGA", "riEGA"),
plot.itemStability = TRUE,
typicalStructure = FALSE,
plot.typicalStructure = FALSE,
seed = NULL,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
iter |
Numeric (length = 1).
Number of replica samples to generate from the bootstrap analysis.
Defaults to 500 |
type |
Character (length = 1).
What type of bootstrap should be performed?
Defaults to
|
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to If you're unsure how many cores your computer has,
then type: |
EGA.type |
Character (length = 1).
Type of EGA model to use.
Defaults to
Arguments for |
plot.itemStability |
Boolean (length = 1).
Should the plot be produced for |
typicalStructure |
Boolean (length = 1).
If |
plot.typicalStructure |
Boolean (length = 1).
If |
seed |
Numeric (length = 1).
Defaults to NULL |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to TRUE |
... |
Additional arguments that can be passed on to
|
Details
The typical network structure is derived from the median (or mean) value of each pairwise relationship. These values tend to reflect the "typical" value taken by an edge across the bootstrap networks. Afterward, the same community detection algorithm is applied to the typical network as the bootstrap networks.
Because the community detection algorithm is applied to the typical network structure, there is a possibility that the community algorithm determines a different number of dimensions than the median number derived from the bootstraps. The typical network structure (and number of dimensions) may not match the empirical EGA number of dimensions or the median number of dimensions from the bootstrap. This result is known and not a bug.
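The idea of a typical network can be illustrated with a short base R sketch that takes the element-wise median across bootstrap replicates. To stay self-contained it uses simulated data and zero-order correlation matrices as stand-ins for the estimated networks; both are assumptions made for illustration and not what bootEGA computes internally.

```r
set.seed(1)

# Simulated data: 100 cases, 4 variables (illustrative only)
X <- matrix(rnorm(100 * 4), nrow = 100)

# Resampling bootstrap: one correlation matrix per replicate
boot_corrs <- lapply(seq_len(50), function(b) {
  rows <- sample(nrow(X), replace = TRUE)
  cor(X[rows, ])
})

# Stack the replicates into a 4 x 4 x 50 array and take the median of each cell
typical <- apply(simplify2array(boot_corrs), c(1, 2), median)
typical
```

Replacing median with mean gives the mean-based variant. Because each cell is summarized separately, the resulting matrix need not equal any single bootstrap replicate, which is one reason its community structure can differ from the median number of dimensions.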
Value
Returns a list containing:
iter |
Number of replica samples in bootstrap |
bootGraphs |
A list containing the networks of each replica sample |
bootCorrs |
A list containing the zero-order correlations of each replica sample |
boot.wc |
A matrix of membership assignments for each replica network with variables down the columns and replicas across the rows |
boot.ndim |
Number of dimensions identified in each replica sample |
summary.table |
A data frame containing number of replica samples, median, standard deviation, standard error, 95% confidence intervals, and quantiles (lower = 2.5% and upper = 97.5%) |
frequency |
A data frame containing the proportion of times the number of dimensions was identified (e.g., .85 of 1,000 = 850 times that specific number of dimensions was found) |
TEFI |
|
type |
Type of bootstrap used |
EGA |
Output of the empirical EGA results
(output will vary based on |
EGA.type |
Type of |
typicalGraph |
A list containing:
|
plot.typical.ega |
Plot output if |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original implementation of bootEGA
Christensen, A. P., & Golino, H. (2021).
Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial.
Psych, 3(3), 479-500.
See Also
itemStability to estimate the stability of the variables in the empirical dimensions and dimensionStability to estimate the stability of the dimensions (structural consistency)
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Standard EGA parametric example
boot.wmt <- bootEGA(
data = wmt, iter = 500,
type = "parametric", ncores = 2
)
# Standard resampling example
boot.wmt <- bootEGA(
data = wmt, iter = 500,
type = "resampling", ncores = 2
)
# Example using {igraph} `cluster_*` function
boot.wmt.spinglass <- bootEGA(
data = wmt, iter = 500,
algorithm = igraph::cluster_spinglass,
# use any function from {igraph}
type = "parametric", ncores = 2
)
# EGA fit example
boot.wmt.fit <- bootEGA(
data = wmt, iter = 500,
EGA.type = "EGA.fit",
type = "parametric", ncores = 2
)
# Hierarchical EGA example
boot.wmt.hier <- bootEGA(
data = wmt, iter = 500,
EGA.type = "hierEGA",
type = "parametric", ncores = 2
)
# Random-intercept EGA example
boot.wmt.ri <- bootEGA(
data = wmt, iter = 500,
EGA.type = "riEGA",
type = "parametric", ncores = 2
)
## End(Not run)
EGA Color Palettes
Description
Color palettes for plotting ggnet2 EGA network plots
Usage
color_palette_EGA(
name = c("polychrome", "blue.ridge1", "blue.ridge2", "rainbow", "rio", "itacare",
"grayscale"),
wc,
sorted = FALSE
)
Arguments
name |
Character.
Name of color scheme (see
For custom colors, enter HEX codes for each dimension in a vector |
wc |
Numeric vector.
A vector representing the community (dimension) membership
of each node in the network. |
sorted |
Boolean.
Should colors be sorted by |
Value
Vector of colors for community memberships
Author(s)
Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com>
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Default
color_palette_EGA(name = "polychrome", wc = ega.wmt$wc)
# Blue Ridge Mountains 1
color_palette_EGA(name = "blue.ridge1", wc = ega.wmt$wc)
# Custom
color_palette_EGA(name = c("#7FD1B9", "#24547e"), wc = ega.wmt$wc)
Compares Community Detection Solutions Using Permutation
Description
A permutation implementation to determine statistical significance of whether the community comparison measure is different from zero
Usage
community.compare(
base,
comparison,
method = c("vi", "nmi", "split.join", "rand", "adjusted.rand"),
iter = 1000,
shuffle.base = TRUE,
verbose = TRUE,
seed = NULL
)
Arguments
base |
Character or numeric vector. A vector of characters or numbers that are treated as the baseline communities |
comparison |
Character or numeric vector (length = |
method |
Character (length = 1).
Comparison metrics from
|
iter |
Numeric (length = 1).
Number of permutations to perform.
Defaults to 1000 |
shuffle.base |
Boolean (length = 1).
Whether the |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to TRUE |
seed |
Numeric (length = 1).
Defaults to NULL |
Value
Returns a data frame containing method used (Method), empirical or observed value (Empirical), and p-value based on the permutation test (p.value)
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Implementation of Permutation Test
Qannari, E. M., Courcoux, P., & Faye, P. (2014).
Significance test of the adjusted Rand index. Application to the free sorting task.
Food Quality and Preference, 32, 93–97.
Variation of Information
Meila, M. (2003, August).
Comparing clusterings by the variation of information.
In Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop,
COLT/Kernel 2003, Washington, DC, USA, August 24-27, 2003. Proceedings (pp. 173-187). Berlin, DE: Springer Berlin Heidelberg.
Normalized Mutual Information
Danon, L., Diaz-Guilera, A., Duch, J., & Arenas, A. (2005).
Comparing community structure identification.
Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008.
Split-join Distance
Dongen, S. (2000).
Performance criteria for graph clustering and Markov cluster experiments.
CWI (Centre for Mathematics and Computer Science).
Rand Index
Rand, W. M. (1971).
Objective criteria for the evaluation of clustering methods.
Journal of the American Statistical Association, 66(336), 846-850.
Adjusted Rand Index
Hubert, L., & Arabie, P. (1985).
Comparing partitions.
Journal of Classification, 2, 193-218.
Steinley, D. (2004). Properties of the Hubert-Arabie adjusted rand index. Psychological Methods, 9(3), 386.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate network
network <- EBICglasso.qgraph(data = wmt)
# Compute Edge Betweenness
edge_between <- community.detection(network, algorithm = "edge_betweenness")
# Compute Fast Greedy
fast_greedy <- community.detection(network, algorithm = "fast_greedy")
# Perform permutation test
community.compare(edge_between, fast_greedy)
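The logic of a permutation test on a partition-comparison measure can be sketched in base R. The sketch below uses the Rand index and a one-sided p-value with a +1 correction; both helpers (rand_index, permutation_sketch) are hypothetical, and the way the p-value is formed is an assumption that may not match community.compare.

```r
# Hypothetical helper: Rand index between two membership vectors
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  ut <- upper.tri(same_a)
  # Proportion of node pairs on which the two partitions agree
  mean(same_a[ut] == same_b[ut])
}

# Hypothetical helper: permutation test by shuffling the baseline memberships
permutation_sketch <- function(base, comparison, iter = 1000) {
  observed <- rand_index(base, comparison)
  null <- replicate(iter, rand_index(sample(base), comparison))
  # One-sided p-value: how often a shuffled baseline agrees at least as well
  p_value <- (sum(null >= observed) + 1) / (iter + 1)
  list(empirical = observed, p.value = p_value)
}

set.seed(123)
base <- rep(1:3, each = 6)
comparison <- base                      # identical partitions
result <- permutation_sketch(base, comparison)
result$empirical                        # 1: every pair is classified the same way
```

Shuffling the baseline keeps the community sizes fixed while breaking the link between nodes and communities, so the shuffled values show how much agreement arises by chance alone.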
Applies the Consensus Clustering Method (Louvain only)
Description
Applies the consensus clustering method introduced by Lancichinetti and Fortunato (2012). The original implementation of this method applies a community detection algorithm repeatedly to the same network. With stochastic networks, the algorithm is likely to identify different community solutions with many repeated applications.
Usage
community.consensus(
network,
order = c("lower", "higher"),
resolution = 1,
consensus.method = c("highest_modularity", "iterative", "most_common", "lowest_tefi"),
consensus.iter = 1000,
correlation.matrix = NULL,
allow.singleton = FALSE,
membership.only = TRUE,
...
)
Arguments
network |
Matrix or |
order |
Character (length = 1).
Defaults to |
resolution |
Numeric (length = 1).
A parameter that adjusts modularity to allow the algorithm to
prefer smaller ( |
consensus.method |
Character (length = 1).
Defaults to
|
consensus.iter |
Numeric (length = 1).
Number of algorithm applications to the network.
Defaults to 1000 |
correlation.matrix |
Symmetric matrix.
Used for computation of |
allow.singleton |
Boolean (length = 1).
Whether singleton or single node communities should be allowed.
Defaults to FALSE |
membership.only |
Boolean.
Whether the memberships only should be output.
Defaults to TRUE |
... |
Not actually used but makes it easier for general functionality in the package |
Details
The goal of the consensus clustering method is to identify a stable solution across algorithm applications to derive a "consensus" clustering. The standard or "iterative" approach is to apply the community detection algorithm N times. Then, a co-occurrence matrix is created representing how often each pair of nodes co-occurred across the applications. Based on some cut-off value (e.g., 0.30), co-occurrences below this value are set to zero, forming a "new" sparse network. The procedure proceeds until all nodes co-occur with all other nodes in their community (or a proportion of 1.00).
Variations of this procedure are also available in this package but are experimental. Use these experimental procedures with caution. More work is necessary before these experimental procedures are validated
At this time, seed setting for consensus clustering is not supported
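The co-occurrence step of the iterative approach can be sketched in base R. The membership matrix below is made up (four algorithm applications on four nodes) and the 0.30 cut-off is the example value from the text; this shows a single pass only, not the package's full iterative procedure.

```r
# Made-up memberships: rows are algorithm applications, columns are nodes
memberships <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 2, 2),
  c(1, 1, 1, 2),
  c(2, 2, 1, 1)
)

# Proportion of applications in which each pair of nodes shares a community
co_occurrence <- Reduce(`+`, lapply(seq_len(nrow(memberships)), function(i) {
  outer(memberships[i, ], memberships[i, ], "==") * 1
})) / nrow(memberships)
diag(co_occurrence) <- 0

# Co-occurrences below the cut-off are set to zero, forming a sparser network
cut_off <- 0.30
co_occurrence[co_occurrence < cut_off] <- 0
co_occurrence
```

In the full procedure the community detection algorithm would be applied again to this thresholded matrix, and the cycle repeats until every pair of nodes within a community co-occurs with proportion 1.00.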
Value
Returns either a vector with the selected solution or a list when membership.only = FALSE:
selected_solution |
Resulting solution from the consensus method |
memberships |
Matrix of memberships across the consensus iterations |
proportion_table |
For methods that use frequency, a table that reports those frequencies alongside their corresponding memberships |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Louvain algorithm
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008).
Fast unfolding of communities in large networks.
Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Consensus clustering
Lancichinetti, A., & Fortunato, S. (2012).
Consensus clustering in complex networks.
Scientific Reports, 2(1), 1–7.
Entropy fit indices
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate correlation matrix
correlation.matrix <- auto.correlate(wmt)
# Estimate network
network <- EBICglasso.qgraph(data = wmt)
# Compute standard Louvain with highest modularity approach
community.consensus(
network,
consensus.method = "highest_modularity"
)
# Compute standard Louvain with iterative (original) approach
community.consensus(
network,
consensus.method = "iterative"
)
# Compute standard Louvain with most common approach
community.consensus(
network,
consensus.method = "most_common"
)
# Compute standard Louvain with lowest TEFI approach
community.consensus(
network,
consensus.method = "lowest_tefi",
correlation.matrix = correlation.matrix
)
Apply a Community Detection Algorithm
Description
General function to apply community detection algorithms available in igraph. Follows the EGAnet approach of setting singleton and disconnected nodes to missing (NA)
Usage
community.detection(
network,
algorithm = c("edge_betweenness", "fast_greedy", "fluid", "infomap", "label_prop",
"leading_eigen", "leiden", "louvain", "optimal", "spinglass", "walktrap"),
allow.singleton = FALSE,
membership.only = TRUE,
...
)
Arguments
network |
Matrix or |
algorithm |
Character or
|
allow.singleton |
Boolean (length = 1).
Whether singleton or single node communities should be allowed.
Defaults to |
membership.only |
Boolean (length = 1).
Whether the memberships only should be output.
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns memberships from a community detection algorithm
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate network
network <- EBICglasso.qgraph(data = wmt)
# Compute Edge Betweenness
community.detection(network, algorithm = "edge_betweenness")
# Compute Fast Greedy
community.detection(network, algorithm = "fast_greedy")
# Compute Fluid
community.detection(
network, algorithm = "fluid",
no.of.communities = 2 # needs to be set
)
# Compute Infomap
community.detection(network, algorithm = "infomap")
# Compute Label Propagation
community.detection(network, algorithm = "label_prop")
# Compute Leading Eigenvector
community.detection(network, algorithm = "leading_eigen")
# Compute Leiden (with modularity)
community.detection(
network, algorithm = "leiden",
objective_function = "modularity"
)
# Compute Leiden (with CPM)
community.detection(
network, algorithm = "leiden",
objective_function = "CPM",
resolution_parameter = 0.05 # "edge density"
)
# Compute Louvain
community.detection(network, algorithm = "louvain")
# Compute Optimal (identifies maximum modularity solution)
community.detection(network, algorithm = "optimal")
# Compute Spinglass
community.detection(network, algorithm = "spinglass")
# Compute Walktrap
community.detection(network, algorithm = "walktrap")
# Example with {igraph} network
community.detection(
convert2igraph(network), algorithm = "walktrap"
)
Homogenize Community Memberships
Description
Memberships from community detection algorithms do not always align numerically. This function seeks to homogenize community memberships between a target membership (the membership to homogenize toward) and one or more other memberships. This function is the core of the dimensionStability and itemStability functions
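To make the idea of aligning labels concrete, here is a minimal base R sketch that relabels one membership vector toward a target by greedily matching the communities with the largest overlap. The function name `homogenize_sketch` and the greedy rule are assumptions made for illustration; the package's own algorithm may differ. Numeric labels are assumed.

```r
homogenize_sketch <- function(target, convert) {
  # Cross-tabulate labels: rows are convert labels, columns are target labels
  overlap <- unclass(table(convert, target))
  new_labels <- setNames(rep(NA_real_, nrow(overlap)), rownames(overlap))

  # Greedily pair the labels that share the most nodes
  while (any(overlap > 0)) {
    idx <- which(overlap == max(overlap), arr.ind = TRUE)[1, ]
    new_labels[rownames(overlap)[idx[1]]] <- as.numeric(colnames(overlap)[idx[2]])
    overlap[idx[1], ] <- 0
    overlap[, idx[2]] <- 0
  }

  # Communities without a partner in the target receive fresh labels
  unmatched <- is.na(new_labels)
  new_labels[unmatched] <- max(target, na.rm = TRUE) + seq_len(sum(unmatched))

  unname(new_labels[as.character(convert)])
}

target  <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
convert <- c(3, 3, 3, 1, 1, 2, 2, 2, 2)
homogenized <- homogenize_sketch(target, convert)
```

In this toy case the fifth and sixth nodes are the only ones that end up in different communities after relabeling, which is the kind of disagreement that becomes visible once labels are aligned.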
Usage
community.homogenize(target.membership, convert.membership)
Arguments
target.membership |
Vector, matrix, or data frame.
The target memberships that all other memberships input into
|
convert.membership |
Vector, matrix, or data frame.
Either a vector of memberships the same length as
|
Value
Returns a vector or matrix the length or size of convert.membership with memberships homogenized toward target.membership
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original implementation of bootEGA
Christensen, A. P., & Golino, H. (2021).
Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial.
Psych, 3(3), 479-500.
Examples
# Get network
network <- network.estimation(wmt2[,7:24])
# Apply Walktrap
network_walktrap <- community.detection(
network, algorithm = "walktrap"
)
# Apply Louvain
network_louvain <- community.detection(
network, algorithm = "louvain"
)
# Homogenize toward Walktrap
community.homogenize(network_walktrap, network_louvain)
Approaches to Detect Unidimensional Communities
Description
A function to apply several approaches to detect a unidimensional community in networks. There have been many different approaches recently, such as expanding the correlation matrix to have orthogonal correlations ("expand"), applying the Leading Eigenvector community detection algorithm cluster_leading_eigen to the correlation matrix ("LE"), and applying the Louvain community detection algorithm cluster_louvain to the correlation matrix ("louvain").
Not necessarily intended for individual use; it is better to use EGA
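The "expand" idea can be sketched in base R as appending a block of extra variables that are uncorrelated (orthogonal) with the observed ones. The function name, the number of added variables (`n_new = 4`), and their mutual correlation (`r_new = 0.50`) are illustrative assumptions here, not values confirmed by this documentation.

```r
expand_correlation_sketch <- function(R, n_new = 4, r_new = 0.50) {
  p <- ncol(R)

  # Start from an identity matrix so all cross-block correlations are zero
  expanded <- diag(p + n_new)

  # Original correlations go in the upper-left block
  expanded[seq_len(p), seq_len(p)] <- R

  # Added variables correlate with each other but not with the originals
  new_idx <- p + seq_len(n_new)
  expanded[new_idx, new_idx] <- r_new
  diag(expanded) <- 1

  expanded
}

# Toy correlation matrix for three observed variables
R <- matrix(0.4, nrow = 3, ncol = 3)
diag(R) <- 1
expanded <- expand_correlation_sketch(R)
```

If community detection on the expanded matrix separates only the added block from the observed variables, the observed variables are behaving as a single dimension.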
Usage
community.unidimensional(
data,
n = NULL,
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
uni.method = c("expand", "LE", "louvain"),
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables that are desired to be in analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
verbose |
Boolean.
Whether messages and (insignificant) warnings should be output.
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns the memberships of the community detection algorithm. The memberships will output regardless of whether the network is unidimensional
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Expand approach
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., Thiyagarajan, J. A., & Martinez-Molina, A. (2020).
Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors:
A simulation and tutorial.
Psychological Methods, 25, 292-320.
Leading Eigenvector approach
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023).
Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.
Behavior Research Methods.
Louvain approach
Christensen, A. P. (2023).
Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison.
PsyArXiv.
Examples
# Load data
wmt <- wmt2[,7:24]
# Louvain with Consensus Clustering (default)
community.unidimensional(wmt)
# Leading Eigenvector
community.unidimensional(wmt, uni.method = "LE")
# Expand
community.unidimensional(wmt, uni.method = "expand")
Visually Compare Two or More EGAnet plots
Description
Organizes EGA plots for comparison. Ensures that nodes are placed in the same layout to maximize comparison
Usage
compare.EGA.plots(
...,
input.list = NULL,
base = 1,
labels = NULL,
rows = NULL,
columns = NULL,
plot.all = TRUE
)
Arguments
... |
Handles multiple arguments:
|
input.list |
List.
Bypasses |
base |
Numeric (length = 1).
Plot to be used as the base for the configuration of the networks.
Uses the number of the order in which the plots are input.
Defaults to |
labels |
Character (same length as input).
Labels for each |
rows |
Numeric (length = 1). Number of rows to spread plots across |
columns |
Numeric (length = 1). Number of columns to spread plots down |
plot.all |
Boolean (length = 1).
Whether plot should be produced or just output.
Defaults to |
Value
Visual comparison of EGAnet objects
Author(s)
Alexander Christensen <alexpaulchristensen@gmail.com>
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain WMT-2 data
wmt <- wmt2[,7:24]
# Draw random samples of 300 cases
sample1 <- wmt[sample(1:nrow(wmt), 300),]
sample2 <- wmt[sample(1:nrow(wmt), 300),]
# Estimate EGAs
ega1 <- EGA(sample1)
ega2 <- EGA(sample2)
# Compare EGAs via plot
compare.EGA.plots(
ega1, ega2,
base = 1, # use "ega1" as base for comparison
labels = c("Sample 1", "Sample 2"),
rows = 1, columns = 2
)
# Change layout to circle plots
compare.EGA.plots(
ega1, ega2,
labels = c("Sample 1", "Sample 2"),
mode = "circle"
)
Convert networks to igraph
Description
Converts networks to igraph format
Usage
convert2igraph(A, diagonal = 0)
Arguments
A |
Matrix or data frame. N x N matrix where N is the number of nodes |
diagonal |
Numeric.
Value to be placed on the diagonal of |
Value
Returns a network in the igraph format
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
Examples
convert2igraph(ega.wmt$network)
Convert networks to tidygraph
Description
Converts networks to tidygraph format
Usage
convert2tidygraph(EGA.object)
Arguments
EGA.object |
A single |
Value
Returns a network in the tidygraph format
Author(s)
Dominique Makowski, Hudson Golino <hfg9s at virginia.edu>, & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
Examples
convert2tidygraph(ega.wmt)
Cosine similarity
Description
Computes cosine similarity
Usage
cosine(x, y = NULL, ...)
Arguments
x |
Numeric vector, matrix, or data frame.
If |
y |
Numeric vector, matrix, or data frame.
Only used if |
... |
Not actually used but makes it easier for general functionality in the package |
Details
On missing values: 0 will be used to replace missing values. When using (matrix) multiplication, the 0 value cancels out the product, rendering the missing value as "not counting" in the sums
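The effect of zero-filling can be seen in a small base R sketch of column-wise cosine similarity. The helper name `cosine_sketch` is hypothetical and only mirrors the behavior described above; it is not the package function.

```r
cosine_sketch <- function(x) {
  x <- as.matrix(x)

  # Missing values become 0 so they drop out of every sum of products
  x[is.na(x)] <- 0

  # Sums of cross-products between all pairs of columns
  cross <- crossprod(x)

  # Euclidean length of each column
  norms <- sqrt(diag(cross))

  cross / outer(norms, norms)
}

toy <- cbind(
  a = c(1, NA, 1),
  b = c(1, 0, 1),
  c = c(0, 1, 0)
)
toy_cosine <- cosine_sketch(toy)
```

Columns `a` and `b` point in the same direction once the missing value is zeroed, while `c` shares no non-zero entries with either of them.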
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Load data
wmt <- wmt2[,7:24]
# Obtain cosines
wmt_cosine <- cosine(wmt)
Depression Data
Description
A response matrix (n = 574) of the Beck Depression Inventory, Beck Anxiety Inventory, and the Athens Insomnia Scale.
Usage
data(depression)
Format
A 574x78 response matrix
Examples
data("depression")
Dimension Stability Statistics from bootEGA
Description
Based on the bootEGA results, this function computes the stability of dimensions. Stability is computed by assessing the proportion of times the original dimension is exactly replicated across bootstrap samples
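The "exactly replicated" proportion can be sketched in base R as follows. The helper name and the toy memberships are invented, and the sketch assumes the bootstrap memberships have already been homogenized toward the original labels.

```r
dimension_replication_sketch <- function(original, boot_memberships) {
  dimensions <- sort(unique(original))

  proportions <- sapply(dimensions, function(d) {
    # Items that define dimension d in the original solution
    in_original <- original == d

    # A bootstrap sample replicates d only if exactly those items carry label d
    replicated <- apply(boot_memberships, 1, function(b) {
      isTRUE(all((b == d) == in_original))
    })

    mean(replicated)
  })

  setNames(proportions, dimensions)
}

original <- c(1, 1, 2, 2)
boot_memberships <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 2, 2),
  c(1, 2, 2, 2),  # item 2 switches dimension, so neither dimension replicates
  c(1, 1, 2, 2)
)
stability <- dimension_replication_sketch(original, boot_memberships)
```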
Usage
dimensionStability(bootega.obj, IS.plot = TRUE, structure = NULL, ...)
Arguments
bootega.obj |
A |
IS.plot |
Boolean (length = 1).
Should the plot be produced for |
structure |
Numeric (length = number of variables).
A theoretical or pre-defined structure.
Defaults to |
... |
Additional arguments.
Used for deprecated arguments from previous versions of |
Value
Returns a list containing:
dimension.stability |
A list containing: |
item.stability |
Results from |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original implementation of bootEGA
Christensen, A. P., & Golino, H. (2021).
Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial.
Psych, 3(3), 479-500.
Conceptual introduction
Christensen, A. P., Golino, H., & Silvia, P. J. (2020).
A psychometric network perspective on the validity and validation of personality trait questionnaires.
European Journal of Personality, 34(6), 1095-1108.
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Estimate bootstrap EGA
boot.wmt <- bootEGA(
data = wmt, iter = 500,
type = "parametric", ncores = 2
)
# Estimate stability statistics
dimensionStability(boot.wmt)
## End(Not run)
Loadings Comparison Test Deep Learning Neural Network Weights
Description
A list of weights from four different neural network models: random vs. non-random model (r_nr_weights), low correlation factor vs. network model (lf_n_weights), high correlation with variables less than or equal to factors vs. network model (hlf_n_weights), and high correlation with variables greater than factors vs. network model (hgf_n_weights)
Usage
data(dnn.weights)
Format
A list with a length of 4
Examples
data("dnn.weights")
Dynamic Exploratory Graph Analysis
Description
Estimates dynamic communities in multivariate time series (e.g., panel data, longitudinal data, intensive longitudinal data) at multiple time scales and at different levels of analysis: individuals (intraindividual structure), groups, and population (interindividual structure)
Usage
dynEGA(
data,
id = NULL,
group = NULL,
n.embed = 5,
tau = 1,
delta = 1,
use.derivatives = 1,
level = c("individual", "group", "population"),
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
ncores,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Participants and variables should be in long format such that row t represents observations for all variables at time point t for a participant. The next row, t + 1, represents the next measurement occasion for that same participant. The next participant's data should immediately follow, in the same pattern, after the previous participant.
For groups, Arguments A measurement occasion variable is not necessary and should be removed from the data before proceeding with the analysis |
id |
Numeric or character (length = 1).
Number or name of the column identifying each individual.
Defaults to |
group |
Numeric or character (length = 1).
Number of the column identifying group membership.
Defaults to |
n.embed |
Numeric (length = 1).
Defaults to |
tau |
Numeric (length = 1).
Defaults to |
delta |
Numeric (length = 1).
Defaults to |
use.derivatives |
Numeric (length = 1).
Defaults to
Generally recommended to leave "as is" |
level |
Character vector (up to length of 3). A character vector indicating which level(s) to estimate: |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to If you're unsure how many cores your computer has,
then type: |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to |
... |
Additional arguments to be passed on to
|
Details
Derivatives for each variable's time series for each participant are estimated using generalized local linear approximation (see glla). EGA is then applied to these derivatives to model how variables are changing together over time. Variables that change together over time are detected as communities
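The flow from long-format data to correlations among changes can be sketched in base R. To stay self-contained, the sketch substitutes simple first differences (`diff`) for the GLLA derivatives, so it is a stand-in rather than what dynEGA computes; the data frame and column names are invented.

```r
# Hypothetical long-format data: two participants, six occasions each
long <- data.frame(
  ID = rep(1:2, each = 6),
  V1 = c(1:6, 6:1),
  V2 = c(2 * (1:6), 2 * (6:1)),
  V3 = c(0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0)
)

# Split by participant so changes are never computed across two people
per_person <- split(long[, c("V1", "V2", "V3")], long$ID)

# First differences per participant (stand-in for GLLA first derivatives)
first_diffs <- lapply(per_person, function(d) apply(as.matrix(d), 2, diff))

# Stack participants and correlate the changes
stacked <- do.call(rbind, first_diffs)
change_cor <- cor(stacked)
```

Here `V1` and `V2` always change in step, so their changes correlate perfectly; variables with this kind of shared change are the ones a community detection step would group together.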
Value
A list containing:
Derivatives |
A list containing:
|
dynEGA |
A list containing: |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Generalized local linear approximation
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2010).
Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.),
The Notre Dame series on quantitative methodology. Statistical methods for modeling human dynamics: An interdisciplinary dialogue,
(p. 161-178). Routledge/Taylor & Francis Group.
Deboeck, P. R., Montpetit, M. A., Bergeman, C. S., & Boker, S. M. (2009). Using derivative estimates to describe intraindividual variability at multiple time scales. Psychological Methods, 14(4), 367-386.
Original dynamic EGA implementation
Golino, H., Christensen, A. P., Moulder, R. G., Kim, S., & Boker, S. M. (2021).
Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections.
Psychometrika.
Time delay embedding procedure
Savitzky, A., & Golay, M. J. (1964).
Smoothing and differentiation of data by simplified least squares procedures.
Analytical Chemistry, 36(8), 1627-1639.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Population structure
simulated_population <- dynEGA(
data = sim.dynEGA, level = "population"
# uses simulated data in package
# useful to understand how data should be structured
)
# Group structure
simulated_group <- dynEGA(
data = sim.dynEGA, level = "group"
# uses simulated data in package
# useful to understand how data should be structured
)
## Not run:
# Individual structure
simulated_individual <- dynEGA(
data = sim.dynEGA, level = "individual",
ncores = 2, # use more for quicker results
verbose = TRUE # progress bar
)
# Population, group, and individual structure
simulated_all <- dynEGA(
data = sim.dynEGA,
level = c("individual", "group", "population"),
ncores = 2, # use more for quicker results
verbose = TRUE # progress bar
)
# Plot population
plot(simulated_all$dynEGA$population)
# Plot groups
plot(simulated_all$dynEGA$group)
# Plot individual
plot(simulated_all$dynEGA$individual, id = 1)
# Step through all plots
# Unless `id` is specified, 4 random IDs
# will be drawn from individuals
plot(simulated_all)
## End(Not run)
Intra- and Inter-individual dynEGA
Description
A wrapper function to estimate both intraindividual (level = "individual") and interindividual (level = "population") structures using dynEGA
Usage
dynEGA.ind.pop(
data,
id = NULL,
n.embed = 5,
tau = 1,
delta = 1,
use.derivatives = 1,
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
ncores,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Participants and variables should be in long format such that row t represents observations for all variables at time point t for a participant. The next row, t + 1, represents the next measurement occasion for that same participant. The next participant's data should immediately follow, in the same pattern, after the previous participant.
For groups, Arguments A measurement occasion variable is not necessary and should be removed from the data before proceeding with the analysis |
id |
Numeric or character (length = 1).
Number or name of the column identifying each individual.
Defaults to |
n.embed |
Numeric (length = 1).
Defaults to |
tau |
Numeric (length = 1).
Defaults to |
delta |
Numeric (length = 1).
Defaults to |
use.derivatives |
Numeric (length = 1).
Defaults to
Generally recommended to leave "as is" |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to If you're unsure how many cores your computer has,
then type: |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Same output as EGAnet{dynEGA}, returning list objects for level = "individual" and level = "population"
Author(s)
Hudson Golino <hfg9s at virginia.edu>
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks
## Not run:
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
data = sim.dynEGA, n.embed = 5, tau = 1,
delta = 1, id = 25, use.derivatives = 1,
ncores = 2, corr = "pearson"
)
## End(Not run)
EGA Network of wmt2 Data
Description
EGA results from ega.wmt <- EGA(wmt2[,7:24]) for the Wiener Matrizen-Test (WMT-2)
Usage
data(ega.wmt)
Format
A list with 8 objects (see Value in EGA)
Examples
data("ega.wmt")
Entropy Fit Index
Description
Computes the fit of a dimensionality structure using empirical entropy. Lower values suggest better fit of a structure to the data.
Usage
entropyFit(data, structure)
Arguments
data |
Matrix or data frame. Contains variables to be used in the analysis |
structure |
Numeric or character vector (length = |
Value
Returns a list containing:
Total.Correlation |
The total correlation of the dataset |
Total.Correlation.MM |
Miller-Madow correction for the total correlation of the dataset |
Entropy.Fit |
The Entropy Fit Index |
Entropy.Fit.MM |
Miller-Madow correction for the Entropy Fit Index |
Average.Entropy |
The average entropy of the dataset |
Author(s)
Hudson F. Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen@gmail.com> and Robert Moulder <rgm4fd@virginia.edu>
References
Initial formalization and simulation
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Estimate EGA model
ega.wmt <- EGA(data = wmt)
## End(Not run)
# Compute entropy indices
entropyFit(data = wmt, structure = ega.wmt$wc)
Ergodicity Information Index
Description
Computes the Ergodicity Information Index
Usage
ergoInfo(
dynEGA.object,
use = c("edge.list", "unweighted", "weighted"),
shuffles = 5000
)
Arguments
dynEGA.object |
A |
use |
Character (length = 1).
A string indicating what network element will be used
to compute the algorithm complexity, the list of edges or the weights of the network.
Defaults to
|
shuffles |
Numeric.
Number of shuffles used to compute the Kolmogorov complexity.
Defaults to |
Value
Returns a list containing:
PrimeWeight |
The prime-weight encoding of the individual networks |
PrimeWeight.pop |
The prime-weight encoding of the population network |
Kcomp |
The Kolmogorov complexity of the prime-weight encoded individual networks |
Kcomp.pop |
The Kolmogorov complexity of the prime-weight encoded population network |
complexity |
The complexity metric proposed by Santoro and Nicosia (2020) |
EII |
The Ergodicity Information Index |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander Christensen <alexpaulchristensen@gmail.com>
References
Original Implementation
Golino, H., Nesselroade, J. R., & Christensen, A. P. (2022).
Toward a psychology of individuals: The ergodicity information index and a bottom-up approach for finding generalizations.
PsyArXiv.
Examples
# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks
## Not run:
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
data = sim.dynEGA[,-26], n.embed = 5, tau = 1,
delta = 1, id = 25, use.derivatives = 1,
ncores = 2, corr = "pearson"
)
# Compute empirical ergodicity information index
eii <- ergoInfo(dyn.ega1)
## End(Not run)
Frobenius Norm (Similarity)
Description
Computes the Frobenius Norm (Ulitzsch et al., 2023)
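For orientation, the plain Frobenius norm of the difference between two networks can be computed in base R as below. This is a distance (0 for identical networks); the value returned by frobenius is a similarity, and how the package rescales the norm is not shown in this documentation, so the sketch should not be read as its implementation. The function name and toy matrices are invented.

```r
frobenius_distance_sketch <- function(network1, network2) {
  # Square root of the summed squared edge-weight differences
  sqrt(sum((as.matrix(network1) - as.matrix(network2))^2))
}

# Two toy three-node networks that differ on a single edge
net1 <- matrix(
  c(0.0, 0.3, 0.0,
    0.3, 0.0, 0.2,
    0.0, 0.2, 0.0),
  nrow = 3, byrow = TRUE
)
net2 <- net1
net2[1, 2] <- net2[2, 1] <- 0

distance <- frobenius_distance_sketch(net1, net2)
```

Because the edge appears twice in a symmetric matrix, a single edge difference of 0.3 contributes `sqrt(2 * 0.3^2)` to the norm. Base R's `norm(net1 - net2, type = "F")` gives the same quantity.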
Usage
frobenius(network1, network2)
Arguments
network1 |
Matrix or data frame. Network to be compared |
network2 |
Matrix or data frame. Second network to be compared |
Value
Returns Frobenius Norm
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
References
Simulation Study
Ulitzsch, E., Khanna, S., Rhemtulla, M., & Domingue, B. W. (2023).
A graph theory based similarity metric enables comparison of subpopulation psychometric networks.
Psychological Methods.
Examples
# Obtain wmt2 data
wmt <- wmt2[,7:24]
# Set seed (for reproducibility)
set.seed(1234)
# Split data
split1 <- sample(
1:nrow(wmt), floor(nrow(wmt) / 2)
)
split2 <- setdiff(1:nrow(wmt), split1)
# Obtain split data
data1 <- wmt[split1,]
data2 <- wmt[split2,]
# Perform EBICglasso
glas1 <- EBICglasso.qgraph(data1)
glas2 <- EBICglasso.qgraph(data2)
# Frobenius norm
frobenius(glas1, glas2)
# 0.7070395
Generalized Total Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices
Description
Computes the fit (Generalized TEFI) of a hierarchical or correlated bifactor dimensionality structure (or hierEGA objects) using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data
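The Von Neumann entropy underlying this index can be sketched in base R: the correlation matrix is rescaled to have a trace of 1 and the entropy is taken over its eigenvalues. The sketch covers only this building block, under the assumption of a natural logarithm, and does not reproduce how genTEFI combines entropies across lower- and higher-order structures.

```r
vn_entropy_sketch <- function(R) {
  # Rescale so the trace equals 1 (a density-matrix-like object)
  density <- R / sum(diag(R))

  eigenvalues <- eigen(density, symmetric = TRUE, only.values = TRUE)$values

  # Zero (or numerically negative) eigenvalues contribute nothing
  eigenvalues <- eigenvalues[eigenvalues > 0]

  -sum(eigenvalues * log(eigenvalues))
}

# Four uncorrelated variables: entropy is at its maximum, log(4)
entropy_independent <- vn_entropy_sketch(diag(4))

# Four strongly correlated variables: entropy drops
R_high <- matrix(0.9, nrow = 4, ncol = 4)
diag(R_high) <- 1
entropy_correlated <- vn_entropy_sketch(R_high)
```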
Usage
genTEFI(data, structure = NULL, verbose = TRUE)
Arguments
data |
Matrix, data frame, or |
structure |
For high-order and correlated bifactor structures,
|
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
Value
Returns a three-column data frame of the Generalized Total Entropy Fit Index using Von Neumann's entropy (VN.Entropy.Fit) (first column), as well as Lower.Order.VN, the TEFI for the first-order factors (second column), and Higher.Order.VN, the equivalent for the second-order factors (third column).
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Example using network scores
opt.hier <- hierEGA(
data = optimism, scores = "network",
plot.EGA = FALSE # No plot for CRAN checks
)
# Compute the Generalized Total Entropy Fit Index
genTEFI(opt.hier)
Generalized Local Linear Approximation
Description
Estimates the derivatives of a time series using generalized local linear approximation (GLLA). GLLA is a filtering method for estimating derivatives from data that uses time delay embedding and a variant of Savitzky-Golay filtering to accomplish the task.
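The two ingredients named above, time delay embedding and least squares weights, can be sketched in a few lines of base R. The weight matrix follows the form W = L(L'L)^-1 from the GLLA references as understood here; the function name `glla_sketch` is hypothetical and the code is an illustration rather than the package's source. It assumes the series is long enough to yield at least two embedded rows.

```r
glla_sketch <- function(x, n.embed, tau, delta, order) {
  # Time delay embedding: each row holds n.embed observations spaced tau apart
  n_rows <- length(x) - (n.embed - 1) * tau
  embedded <- sapply(
    seq_len(n.embed) - 1,
    function(k) x[(1 + k * tau):(n_rows + k * tau)]
  )

  # Time offsets within a window, centered on the window's midpoint
  offsets <- delta * tau * (seq_len(n.embed) - mean(seq_len(n.embed)))

  # Column k of L is offsets^k / k! for k = 0, ..., order
  L <- sapply(0:order, function(k) offsets^k / factorial(k))

  # Least squares weights mapping each window to derivative estimates
  W <- L %*% solve(crossprod(L))

  embedded %*% W
}

# A linear series: the level rises by 1 per time point
derivatives <- glla_sketch(49:56, n.embed = 4, tau = 1, delta = 1, order = 2)
```

For this linear series, the first column holds the window midpoints (50.5 to 54.5), the second column is the constant slope of 1, and the third column (second derivative) is 0 up to rounding error.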
Usage
glla(x, n.embed, tau, delta, order)
Arguments
x |
Numeric vector. An observed time series |
n.embed |
Numeric (length = 1).
Number of embedded dimensions (the number of observations
to be used in the |
tau |
Numeric (length = 1).
Number of observations to offset successive embeddings in
the |
delta |
Numeric (length = 1).
The time between successive observations in the time series.
Default is |
order |
Numeric (length = 1).
The maximum order of the derivative to be estimated. For example,
|
Value
Returns a matrix containing n columns in which n is one plus the maximum order of the derivatives to be estimated via generalized local linear approximation
Author(s)
Hudson Golino <hfg9s at virginia.edu>
References
GLLA implementation
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2010).
Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.),
The Notre Dame series on quantitative methodology. Statistical methods for modeling human dynamics: An interdisciplinary dialogue,
(p. 161-178). Routledge/Taylor & Francis Group.
Deboeck, P. R., Montpetit, M. A., Bergeman, C. S., & Boker, S. M. (2009). Using derivative estimates to describe intraindividual variability at multiple time scales. Psychological Methods, 14(4), 367-386.
Filtering procedure
Savitzky, A., & Golay, M. J. (1964).
Smoothing and differentiation of data by simplified least squares procedures.
Analytical Chemistry, 36(8), 1627-1639.
Examples
# A time series with 8 time points
tseries <- 49:56
deriv.tseries <- glla(tseries, n.embed = 4, tau = 1, delta = 1, order = 2)
Hierarchical EGA
Description
Estimates EGA using the lower-order solution of the Louvain algorithm (cluster_louvain) to identify the lower-order dimensions and then uses factor or network loadings to estimate factor or network scores, which are used to estimate the higher-order dimensions (for more details, see Jiménez et al., 2023)
Usage
hierEGA(
data,
loading.method = c("original", "revised"),
rotation = NULL,
scores = c("factor", "network"),
loading.structure = c("simple", "full"),
impute = c("mean", "median", "none"),
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
lower.algorithm = "louvain",
higher.algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
plot.EGA = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis (does not accept correlation matrices) |
loading.method |
Character (length = 1).
Sets network loading calculation based on implementation
described in |
rotation |
Character.
A rotation to use to obtain a simpler structure.
For a list of rotations, see |
scores |
Character (length = 1).
How should scores for the higher-order structure be estimated?
Defaults to Factor scores use the number of communities from
|
loading.structure |
Character (length = 1).
Whether simple structure or the saturated loading matrix
should be used when computing scores (
Simple structure is the more conservative (established) approach
and is therefore the default. Treat |
impute |
Character (length = 1). If there are any missing data, then imputation can be implemented. Available options:
|
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
lower.algorithm |
Character or
Louvain with consensus clustering is strongly recommended. Using any other algorithm is considered experimental as they have not been designed to capture lower order communities |
higher.algorithm |
Character or
Using |
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
plot.EGA |
Boolean.
If |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns a list of lists containing:
lower_order |
|
higher_order |
|
parameters |
A list containing |
dim.variables |
A data frame with variable names and their lower and higher order assignments |
TEFI |
Generalized TEFI using |
plot.hierEGA |
Plot output if |
Author(s)
Marcos Jiménez <marcosjnezhquez@gmail.com>, Francisco J. Abad <fjose.abad@uam.es>, Eduardo Garcia-Garzon <egarcia@ucjc.edu>, Hudson Golino <hfg9s@virginia.edu>, Alexander P. Christensen <alexpaulchristensen@gmail.com>, and Luis Eduardo Garrido <luisgarrido@pucmm.edu.do>
References
Hierarchical EGA simulation
Jiménez, M., Abad, F. J., Garcia-Garzon, E., Golino, H., Christensen, A. P., & Garrido, L. E. (2023).
Dimensionality assessment in bifactor structures with multiple general factors: A network psychometrics approach.
Psychological Methods.
3+ level hierarchical EGA
Samo, A., Christensen, A. P., Abad, F. J., Garrido, L. E., Garcia-Garzon, E., Golino, H. & McAbee, S. T. (2023). Building the structure of personality from the bottom-up using Hierarchical Exploratory Graph Analysis.
PsyArXiv.
Conceptual implementation
Golino, H., Thiyagarajan, J. A., Sadana, R., Teles, M., Christensen, A. P., & Boker, S. M. (2020).
Investigating the broad domains of intrinsic capacity, functional ability and
environment: An exploratory graph analysis approach for improving analytical
methodologies for measuring healthy aging.
PsyArXiv.
Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024).
Revised network loadings.
PsyArXiv.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Example using network scores
opt.hier <- hierEGA(
data = optimism, scores = "network",
plot.EGA = FALSE # No plot for CRAN checks
)
# Plot multilevel plot
plot(opt.hier, plot.type = "multilevel")
# Plot multilevel plot with higher order
# border color matching the corresponding
# lower order color
plot(opt.hier, color.match = TRUE)
# Plot levels separately
plot(opt.hier, plot.type = "separate")
Convert network to matrix
Description
Converts network to matrix
Usage
igraph2matrix(igraph_network, diagonal = 0)
Arguments
igraph_network |
network object |
diagonal |
Numeric (length = 1).
Value to be placed on the diagonal of |
Value
Returns a network in the format
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
Examples
# Convert network to {igraph}
igraph_network <- convert2igraph(ega.wmt$network)
# Convert network back to matrix
igraph2matrix(igraph_network)
Information Theoretic Mixture Clustering for dynEGA
Description
Performs hierarchical clustering using Jensen-Shannon distance followed by the Louvain algorithm with consensus clustering. The method iteratively identifies smaller and smaller clusters until there is no change in the clusters identified
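As a loose illustration of the first step, the base R sketch below computes Jensen-Shannon distances between discrete probability vectors and passes them to `hclust`. The package works on networks rather than on the toy probability profiles used here, and the Louvain consensus step is omitted, so this only conveys the general shape of distance-based hierarchical clustering. All names and values are invented.

```r
js_distance <- function(p, q) {
  m <- (p + q) / 2

  # Kullback-Leibler divergence in bits, with 0 * log(0) treated as 0
  kl <- function(a, b) sum(ifelse(a > 0, a * log2(a / b), 0))

  # The distance is the square root of the Jensen-Shannon divergence
  sqrt(max(0, (kl(p, m) + kl(q, m)) / 2))
}

# Hypothetical probability profiles for four participants
profiles <- rbind(
  a = c(0.70, 0.20, 0.10),
  b = c(0.65, 0.25, 0.10),
  c = c(0.10, 0.20, 0.70),
  d = c(0.10, 0.25, 0.65)
)

# Pairwise distance matrix
n <- nrow(profiles)
distances <- matrix(
  0, nrow = n, ncol = n,
  dimnames = list(rownames(profiles), rownames(profiles))
)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    distances[i, j] <- js_distance(profiles[i, ], profiles[j, ])
  }
}

# Hierarchical clustering on the distances, cut into two clusters
tree <- hclust(as.dist(distances), method = "average")
clusters <- cutree(tree, k = 2)
```

With base 2 logarithms the distance is bounded between 0 (identical distributions) and 1 (distributions with no overlap), which makes values comparable across pairs.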
Usage
infoCluster(dynEGA.object, plot.cluster = TRUE, ...)
Arguments
dynEGA.object |
A |
plot.cluster |
Boolean (length = 1).
Should plot of optimal and hierarchical clusters be output?
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
clusters |
A vector indicating the cluster each participant belongs to |
clusterTree |
The dendrogram from |
clusterPlot |
Plot output from results |
JSD |
Jensen-Shannon Distance |
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
See Also
plot.EGAnet
for plot usage in EGAnet
Examples
# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks
## Not run:
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
data = sim.dynEGA, n.embed = 5, tau = 1,
delta = 1, id = 25, use.derivatives = 1,
ncores = 2, corr = "pearson"
)
# Perform information-theoretic clustering
clust1 <- infoCluster(dynEGA.object = dyn.ega1)
## End(Not run)
Information Theory Metrics
Description
A general function to compute several different information theory metrics
Usage
information(
data,
base = 2.718282,
bins = floor(sqrt(nrow(data)/5)),
statistic = c("entropy", "joint.entropy", "conditional.entropy", "total.correlation",
"dual.total.correlation", "o.information")
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
base |
Numeric (length = 1). Base of logarithm to use for entropy. Common options include:
Defaults to |
bins |
Numeric (length = 1).
Number of bins if data are not discrete.
Defaults to |
statistic |
Character. Information theory statistics to compute. Available options:
By default, all statistics are computed |
Value
Returns list containing only requested statistic
Author(s)
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Shannon's entropy
Shannon, C. E. (1948). A mathematical theory of communication.
The Bell System Technical Journal, 27(3), 379-423.
Formalization of total correlation
Watanabe, S. (1960).
Information theoretical analysis of multivariate correlation.
IBM Journal of Research and Development 4, 66-82.
Applied implementation of total correlation
Felix, L. M., Mansur-Alves, M., Teles, M., Jamison, L., & Golino, H. (2021).
Longitudinal impact and effects of booster sessions in a cognitive training program for healthy older adults.
Archives of Gerontology and Geriatrics, 94, 104337.
Formalization of dual total correlation
Han, T. S. (1978).
Nonnegative entropy measures of multivariate symmetric correlations.
Information and Control, 36, 133-156.
Formalization of O-information
Crutchfield, J. P. (1994). The calculi of emergence: Computation, dynamics and induction.
Physica D: Nonlinear Phenomena, 75(1-3), 11-54.
Applied implementation of O-information
Marinazzo, D., Van Roozendaal, J., Rosas, F. E., Stella, M., Comolatti, R., Colenbier, N., Stramaglia, S., & Rosseel, Y. (2024).
An information-theoretic approach to build hypergraphs in psychometrics.
Behavior Research Methods, 1-23.
Examples
# All measures
information(wmt2[,7:24])
# One measure
information(wmt2[,7:24], statistic = "joint.entropy")
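As a rough illustration of what the entropy-based statistics measure, the sketch below computes Shannon entropy, joint entropy, and total correlation in base R. It is not the package's implementation: the helper names (shannon_entropy, joint_entropy, total_correlation) are hypothetical, and the data are assumed to be discrete already, so no binning is performed.

```r
# Hypothetical helpers (not part of EGAnet); assumes discrete data
shannon_entropy <- function(x, base = exp(1)) {
  p <- table(x) / length(x)        # empirical probabilities
  p <- p[p > 0]                    # drop empty categories
  -sum(p * log(p, base = base))
}

joint_entropy <- function(data, base = exp(1)) {
  # Treat each unique row (response pattern) as one joint outcome
  patterns <- do.call(paste, c(unname(as.data.frame(data)), sep = "_"))
  shannon_entropy(patterns, base = base)
}

total_correlation <- function(data, base = exp(1)) {
  # Sum of marginal entropies minus the joint entropy
  marginals <- sum(apply(data, 2, shannon_entropy, base = base))
  marginals - joint_entropy(data, base = base)
}

# Two identical binary variables plus one independent variable
toy <- cbind(
  a = c(0, 0, 1, 1, 0, 0, 1, 1),
  b = c(0, 0, 1, 1, 0, 0, 1, 1),
  c = c(0, 1, 0, 1, 1, 0, 1, 0)
)
h_a <- shannon_entropy(toy[, "a"], base = 2)  # 1 bit
tc  <- total_correlation(toy, base = 2)       # 1 bit shared by a and b
```

In this toy example the three marginal entropies sum to 3 bits while the joint entropy is 2 bits, so the total correlation of 1 bit reflects the redundancy between a and b.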
Intelligence Data
Description
A response matrix (n = 1152) of the International Cognitive Ability Resource (ICAR) intelligence battery developed by Condon and Revelle (2016).
Usage
data(intelligenceBattery)
Format
A 1185x125 response matrix
Examples
data("intelligenceBattery")
Measurement Invariance of EGA Structure
Description
Estimates configural invariance using bootEGA on all data (across groups) first. After configural invariance is established, metric invariance is tested using the community structure that established configural invariance (see Details for more information on this process)
Usage
invariance(
data,
groups,
structure = NULL,
iter = 500,
configural.threshold = 0.7,
configural.type = c("parametric", "resampling"),
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
ncores,
seed = NULL,
verbose = TRUE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
groups |
Numeric or character vector (length = |
structure |
Numeric or character vector (length = |
iter |
Numeric (length = 1).
Number of iterations to perform for the permutation.
Defaults to |
configural.threshold |
Numeric (length = 1).
Value to use as a threshold in |
configural.type |
Character (length = 1).
Type of bootstrap to use for configural invariance in |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to If you're unsure how many cores your computer has,
then type: |
seed |
Numeric (length = 1).
Defaults to |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to |
... |
Additional arguments that can be passed on to
|
Details
In traditional psychometrics, measurement invariance is performed in sequential testing from more flexible (more free parameters) to more rigid (fewer free parameters) structures. Measurement invariance in network psychometrics is no different.
Configural Invariance
To establish configural invariance, the data are collapsed across groups and a common sample structure is identified using bootEGA and itemStability. If any variables have a replication less than 0.70 in their assigned dimension, then they are considered unstable and therefore not invariant. These variables are removed and the process is repeated until all items are considered stable (replication values greater than 0.70) or there are no variables left. If configural invariance cannot be established, then the results of the last run are returned and metric invariance is not tested (because configural invariance is not met). Importantly, if any variables are removed, then configural invariance is not met for the original structure. Any removal suggests that only partial configural invariance is met.
Metric Invariance
The variables that remain after configural invariance are submitted to metric invariance. First, a network is estimated for each group and network loadings (net.loads) are computed using the assigned community memberships (determined during configural invariance). Then, the difference between the assigned loadings of the groups is computed. This difference represents the empirical values. Second, the group memberships are permuted and networks are estimated based on these permuted groups iter times. For each permutation, network loadings are computed and the difference between the assigned loadings of the groups is computed, resulting in a null distribution. The empirical difference is then compared against the null distribution using a two-tailed p-value based on the number of null distribution differences that are greater and less than the empirical differences for each variable. Both uncorrected and false discovery rate corrected p-values are returned in the results. Uncorrected p-values are flagged for significance along with the direction of group differences.
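The permutation logic can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's code: each variable's mean absolute correlation with the other variables stands in for its network loading, the two-tailed p-value uses one common definition (proportion of null differences at least as extreme in absolute value), and loading_proxy and perm_test are hypothetical names.

```r
# Hypothetical stand-in for network loadings:
# each variable's mean absolute correlation with the others
loading_proxy <- function(data) {
  r <- abs(cor(data))
  diag(r) <- NA
  colMeans(r, na.rm = TRUE)
}

perm_test <- function(data, groups, iter = 500, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  labels <- unique(groups)

  # Group difference for a given assignment of group labels
  group_diff <- function(g) {
    loading_proxy(data[g == labels[1], , drop = FALSE]) -
      loading_proxy(data[g == labels[2], , drop = FALSE])
  }

  empirical <- group_diff(groups)

  # Null distribution: shuffle group memberships `iter` times
  null <- replicate(iter, group_diff(sample(groups)))

  # Two-tailed p-value: proportion of null differences at least
  # as extreme (in absolute value) as the empirical difference
  p <- rowMeans(abs(null) >= abs(empirical))

  data.frame(
    difference = empirical,
    p = p,
    p_BH = p.adjust(p, method = "BH")  # false discovery rate correction
  )
}

# Simulated data: 200 cases, 4 variables, 2 groups
set.seed(1)
toy <- matrix(rnorm(200 * 4), ncol = 4,
              dimnames = list(NULL, paste0("V", 1:4)))
toy_groups <- rep(1:2, each = 100)
toy_result <- perm_test(toy, toy_groups, iter = 200, seed = 2)
```

Because the simulated groups come from the same distribution, large p-values are expected here. The sign of the difference column indicates which group has the larger value.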
Three or More Groups
When there are 3 or more groups, the function performs metric invariance testing by comparing all possible pairs of groups. Specifically:
- Pairwise Comparisons: The function generates all possible unique group pairings and computes the differences in network loadings for each pair. The same community structure, derived from configural invariance or provided by the user, is used for all groups.
- Permutation Testing: For each group pair, permutation tests are conducted to assess the statistical significance of the observed differences in loadings. p-values are calculated based on the proportion of permuted differences that are greater than or equal to the observed difference.
- Result Compilation: The function compiles the results for each pair, including both uncorrected (p) and FDR-corrected (Benjamini-Hochberg; p_BH) p-values, and the direction of differences. It returns a summary of the findings for all pairwise comparisons.
This approach allows for a detailed examination of metric invariance across multiple groups, ensuring that all potential differences are thoroughly assessed while maintaining the ability to identify specific group differences.
For more details, see Jamison, Christensen, and Golino (2024)
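A minimal base R sketch of how unique group pairings can be enumerated when there are three or more groups. The group labels and object names are hypothetical, and this shows only the pairing step, not the loadings or permutations.

```r
# Hypothetical group labels (three groups)
groups <- rep(c("A", "B", "C"), times = c(40, 35, 25))

# All unique pairs of groups: one column per comparison
pairs <- combn(unique(groups), 2)

# Row indices of the cases involved in each pairwise comparison
pair_cases <- lapply(seq_len(ncol(pairs)), function(i) {
  which(groups %in% pairs[, i])
})
names(pair_cases) <- apply(pairs, 2, paste, collapse = "-")
```

With k groups this yields k(k - 1)/2 comparisons, so the number of permutation tests grows quickly as groups are added.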
Value
Returns a list containing:
configural.results |
|
memberships |
Original memberships provided in |
EGA |
Original |
groups |
A list containing: |
permutation |
A list containing:
|
results |
Data frame of the results (which are printed) |
Author(s)
Laura Jamison <lj5yn@virginia.edu>, Hudson F. Golino <hfg9s at virginia.edu>, and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original implementation
Jamison, L., Christensen, A. P., & Golino, H. F. (2024).
Metric invariance in exploratory graph analysis via permutation testing.
Methodology, 20(2), 144-186.
See Also
plot.EGAnet
for plot usage in EGAnet
Examples
# Load data
wmt <- wmt2[-1,7:24]
# Groups
groups <- rep(1:2, each = nrow(wmt) / 2)
## Not run:
# Measurement invariance
results <- invariance(wmt, groups, ncores = 2)
# Plot with uncorrected alpha = 0.05
plot(results, p_type = "p", p_value = 0.05)
# Plot with BH-corrected alpha = 0.10
plot(results, p_type = "p_BH", p_value = 0.10)
## End(Not run)
Diagnostics Analysis for Low Stability Items
Description
Computes the between- and within-community strength of each variable for each community
Usage
itemDiagnostics(data, ...)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
... |
Additional arguments to pass on to
|
Value
Returns a list containing:
diagnostics |
A data frame containing the diagnostics of low item stabilities
(see |
boot |
Output from |
uva |
Output from |
minor |
A list containing suggested items to |
loadings |
Output from |
suggested |
Variables that are suggested to be retained to increase item stability |
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com>, Hudson Golino <hfg9s at virginia.edu>, and Luis Eduardo Garrido <garrido.luiseduardo@gmail.com>
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Obtain diagnostics
diagnostics <- itemDiagnostics(wmt, ncores = 2)
## End(Not run)
Item Stability Statistics from bootEGA
Description
Based on the bootEGA results, this function computes and plots the number of times a variable is estimated in the same dimension as originally estimated by an empirical EGA structure or a theoretical/input structure. The output also contains each variable's replication frequency (i.e., proportion of bootstraps that a variable appeared in each dimension)
Usage
itemStability(bootega.obj, IS.plot = TRUE, structure = NULL, ...)
Arguments
bootega.obj |
A |
IS.plot |
Boolean (length = 1).
Should the plot be produced for |
structure |
Numeric (length = number of variables).
A theoretical or pre-defined structure.
Defaults to |
... |
Deprecated arguments from previous versions of |
Value
Returns a list containing:
membership |
A list containing:
|
item.stability |
A list containing:
|
plot |
Plot output if |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original implementation of bootEGA
Christensen, A. P., & Golino, H. (2021).
Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial.
Psych, 3(3), 479-500.
Conceptual introduction
Christensen, A. P., Golino, H., & Silvia, P. J. (2020).
A psychometric network perspective on the validity and validation of personality trait questionnaires.
European Journal of Personality, 34(6), 1095-1108.
See Also
plot.EGAnet
for plot usage in EGAnet
Examples
# Load data
wmt <- wmt2[,7:24]
## Not run:
# Standard EGA example
boot.wmt <- bootEGA(
data = wmt, iter = 500,
type = "parametric", ncores = 2
)
## End(Not run)
# Standard item stability
wmt.is <- itemStability(boot.wmt)
## Not run:
# EGA fit example
boot.wmt.fit <- bootEGA(
data = wmt, iter = 500,
EGA.type = "EGA.fit",
type = "parametric", ncores = 2
)
# EGA fit item stability
wmt.is.fit <- itemStability(boot.wmt.fit)
# Hierarchical EGA example
boot.wmt.hier <- bootEGA(
data = wmt, iter = 500,
EGA.type = "hierEGA",
type = "parametric", ncores = 2
)
# Hierarchical EGA item stability
wmt.is.hier <- itemStability(boot.wmt.hier)
# Random-intercept EGA example
boot.wmt.ri <- bootEGA(
data = wmt, iter = 500,
EGA.type = "riEGA",
type = "parametric", ncores = 2
)
# Random-intercept EGA item stability
wmt.is.ri <- itemStability(boot.wmt.ri)
## End(Not run)
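To make the replication statistic concrete, the following base R sketch computes it from a small hand-made set of bootstrap memberships. The inputs are hypothetical, and the sketch assumes community labels are already aligned across replicates, a step that real bootstrap results would require first.

```r
# Hypothetical inputs: empirical memberships for 6 items and
# memberships from 5 bootstrap replicates (rows = replicates).
# Assumes community labels are already aligned across replicates.
empirical <- c(1, 1, 1, 2, 2, 2)
boot_memberships <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2, 1),
  c(1, 1, 2, 2, 2, 2)
)

# Proportion of replicates in which each item lands in its
# empirical dimension
replication <- colMeans(
  sweep(boot_memberships, 2, empirical, FUN = "==")
)

# Proportion of replicates in which each item lands in each dimension
dims <- sort(unique(empirical))
across_dims <- sapply(dims, function(d) colMeans(boot_memberships == d))
colnames(across_dims) <- dims
```

Here the third item replicates in its empirical dimension in 3 of 5 replicates (0.60) and the sixth in 4 of 5 (0.80).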
Jensen-Shannon Distance
Description
Computes the Jensen-Shannon Distance between two networks
Usage
jsd(network1, network2, method = c("kld", "spectral"), signed = TRUE)
Arguments
network1 |
Matrix or data frame. Network to be compared |
network2 |
Matrix or data frame. Second network to be compared |
method |
Character (length = 1).
Method to compute Jensen-Shannon Distance.
Defaults to
|
signed |
Boolean (length = 1).
Should networks remain signed?
Defaults to |
Value
Returns Jensen-Shannon Distance
Author(s)
Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>
Examples
# Obtain wmt2 data
wmt <- wmt2[,7:24]
# Set seed (for reproducibility)
set.seed(1234)
# Split data
split1 <- sample(
1:nrow(wmt), floor(nrow(wmt) / 2)
)
split2 <- setdiff(1:nrow(wmt), split1)
# Obtain split data
data1 <- wmt[split1,]
data2 <- wmt[split2,]
# Perform EBICglasso
glas1 <- EBICglasso.qgraph(data1)
glas2 <- EBICglasso.qgraph(data2)
# Spectral JSD
jsd(glas1, glas2)
# 0.1595893
# Spectral JSS (similarity)
1 - jsd(glas1, glas2)
# 0.8404107
# Jensen-Shannon Divergence
jsd(glas1, glas2, method = "kld")
# 0.1393621
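For readers unfamiliar with the quantity itself, here is a generic base R sketch of the Jensen-Shannon distance between two discrete probability distributions. It does not show how jsd() derives distributions from networks (that step is not covered here), and the function names kld and js_distance are hypothetical.

```r
# Kullback-Leibler divergence (terms with p = 0 contribute 0)
kld <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Jensen-Shannon distance: square root of the JS divergence
js_distance <- function(p, q) {
  p <- p / sum(p)        # normalize to probabilities
  q <- q / sum(q)
  m <- (p + q) / 2       # mixture distribution
  sqrt(0.5 * kld(p, m) + 0.5 * kld(q, m))
}

same     <- js_distance(c(0.2, 0.3, 0.5), c(0.2, 0.3, 0.5))  # 0
disjoint <- js_distance(c(1, 0), c(0, 1))                    # sqrt(log(2))
```

With natural logarithms the distance is bounded above by sqrt(log(2)), which is reached when the two distributions share no support.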
Computes the (Signed) Modularity Statistic
Description
Computes (signed) modularity statistic given a network and community structure. Allows the resolution parameter to be set
Usage
modularity(network, memberships, resolution = 1, signed = FALSE)
Arguments
network |
Matrix or data frame. A symmetric matrix representing a network |
memberships |
Numeric (length = |
resolution |
Numeric (length = 1).
A parameter that adjusts modularity to
prefer smaller ( |
signed |
Boolean (length = 1).
Whether signed or absolute modularity should be computed.
The most common modularity metric is defined by positive values only.
Gomez et al. (2009) introduced a signed version of modularity that
will discount modularity for edges with negative values. This property
isn't always desired for psychometric networks. If |
Value
Returns the modularity statistic
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com> with assistance from GPT-4
References
Gomez, S., Jensen, P., & Arenas, A. (2009). Analysis of community structure in networks of correlated data. Physical Review E, 80(1), 016114.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate EGA
ega.wmt <- EGA(wmt, model = "glasso")
# Compute standard (absolute values) modularity
modularity(
network = ega.wmt$network,
memberships = ega.wmt$wc,
signed = FALSE
)
# 0.1697952
# Compute signed modularity
modularity(
network = ega.wmt$network,
memberships = ega.wmt$wc,
signed = TRUE
)
# 0.1701946
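The standard (absolute value) modularity formula with a resolution parameter can be written out in a few lines of base R. This is a sketch of the textbook definition, not the package's implementation: modularity_sketch is a hypothetical name, the signed variant is not covered, and placing the resolution parameter on the null-model term is the usual convention rather than something confirmed by this page.

```r
# Hypothetical helper (not the package's modularity function)
modularity_sketch <- function(network, memberships, resolution = 1) {
  A <- abs(as.matrix(network))   # absolute edge weights
  diag(A) <- 0                   # ignore self-loops
  k <- rowSums(A)                # node strength
  two_m <- sum(A)                # twice the total edge weight
  same <- outer(memberships, memberships, "==")
  sum((A - resolution * outer(k, k) / two_m) * same) / two_m
}

# Two triangles (nodes 1-3 and 4-6) joined by a single edge (3-4)
A <- matrix(0, 6, 6)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(4, 5), c(4, 6), c(5, 6), c(3, 4))
A[edges] <- 1
A <- A + t(A)

Q <- modularity_sketch(A, memberships = c(1, 1, 1, 2, 2, 2))
# Q equals 5/14 for this network
```

Each triangle contributes 6/14 - (7/14)^2 to the statistic, which gives 5/14 in total.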
Network Loadings
Description
Computes the between- and within-community strength of each variable for each community
Usage
net.loads(
A,
wc,
loading.method = c("original", "revised"),
scaling = 2,
rotation = NULL,
...
)
Arguments
A |
Network matrix, data frame, or |
wc |
Numeric or character vector (length = |
loading.method |
Character (length = 1).
Sets network loading calculation based on implementation
described in |
scaling |
Numeric (length = 1).
Scaling factor for the magnitude of the |
rotation |
Character.
A rotation to use to obtain a simpler structure.
For a list of rotations, see |
... |
Additional arguments to pass on to |
Details
Simulation studies have demonstrated that a node's strength centrality is roughly equivalent to factor loadings (Christensen & Golino, 2021; Hallquist, Wright, & Molenaar, 2019). Hallquist and colleagues (2019) found that node strength represented a combination of dominant and cross-factor loadings. This function computes each node's strength within each specified dimension, providing a rough equivalent to factor loadings (including cross-loadings; Christensen & Golino, 2021).
Value
Returns a list containing:
unstd |
A matrix of the unstandardized within- and between-community strength values for each node |
std |
A matrix of the standardized within- and between-community strength values for each node |
rotated |
|
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com> and Hudson Golino <hfg9s at virginia.edu>
References
Original implementation and simulation
Christensen, A. P., & Golino, H. (2021).
On the equivalency of factor and network loadings.
Behavior Research Methods, 53, 1563-1580.
Demonstration of node strength similarity to CFA loadings
Hallquist, M., Wright, A. C. G., & Molenaar, P. C. M. (2019).
Problems with centrality measures in psychopathology symptom networks: Why network psychometrics cannot escape psychometric theory.
Multivariate Behavioral Research, 1-25.
Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024).
Revised network loadings.
PsyArXiv.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate EGA
ega.wmt <- EGA(
data = wmt,
plot.EGA = FALSE # No plot for CRAN checks
)
# Network loadings
net.loads(ega.wmt)
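The core idea of within- and between-community strength can be sketched in base R as follows. This covers only the unstandardized sums of absolute edge weights. The sign handling, standardization, scaling, and rotation performed by net.loads are deliberately left out, and strength_by_community is a hypothetical name.

```r
# Hypothetical helper: unstandardized within- and between-community
# strength (sum of absolute edge weights) for each node
strength_by_community <- function(A, wc) {
  A <- abs(as.matrix(A))
  diag(A) <- 0
  communities <- sort(unique(wc))
  loads <- sapply(communities, function(community) {
    rowSums(A[, wc == community, drop = FALSE])
  })
  dimnames(loads) <- list(colnames(A), communities)
  loads
}

# Toy network: 4 nodes, 2 communities
toy_net <- matrix(
  c(0.0, 0.4, 0.1, 0.0,
    0.4, 0.0, 0.0, 0.2,
    0.1, 0.0, 0.0, 0.5,
    0.0, 0.2, 0.5, 0.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("V", 1:4), paste0("V", 1:4))
)
toy_loads <- strength_by_community(toy_net, wc = c(1, 1, 2, 2))
```

For V1, the value in its own community's column (0.4) plays the role of a dominant loading and the value in the other column (0.1) plays the role of a cross-loading.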
Network Scores
Description
This function computes network scores based on each node's strength within each community in the network (see net.loads). These values are used as "network loadings" for the weights of each variable.
Network scores are computed as a formative composite rather than a reflective factor. This composite representation is consistent with psychometric network theory, which proposes no latent factors.
Scores can be computed as a "simple" structure, which is equivalent to weighted sum scores, or as a "full" structure, which is equivalent to an EFA approach. Conservatively, the "simple" structure approach is recommended until further validation
Usage
net.scores(
data,
A,
wc,
loading.method = c("original", "revised"),
rotation = NULL,
scores = c("Anderson", "Bartlett", "components", "Harman", "network", "tenBerge",
"Thurstone"),
loading.structure = c("simple", "full"),
impute = c("mean", "median", "none"),
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
A |
Network matrix, data frame, or |
wc |
Numeric or character vector (length = |
loading.method |
Character (length = 1).
Sets network loading calculation based on implementation
described in |
rotation |
Character.
A rotation to use to obtain a simpler structure.
For a list of rotations, see |
scores |
Character (length = 1).
How should scores be estimated?
Defaults to |
loading.structure |
Character (length = 1).
Whether simple structure or the saturated loading matrix
should be used when computing scores.
Defaults to
Simple structure is the more "conservative" (established) approach
and is therefore the default. Treat |
impute |
Character (length = 1). If there are any missing data, then imputation can be implemented. Available options:
|
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
scores |
A list containing the standardized ( |
loadings |
Output from |
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com> and Hudson F. Golino <hfg9s at virginia.edu>
References
Original implementation and simulation for loadings
Christensen, A. P., & Golino, H. (2021).
On the equivalency of factor and network loadings.
Behavior Research Methods, 53, 1563-1580.
Preliminary simulation for scores
Golino, H., Christensen, A. P., Moulder, R., Kim, S., & Boker, S. M. (2021).
Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections.
Psychometrika.
Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024).
Revised network loadings.
PsyArXiv.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate EGA
ega.wmt <- EGA(
data = wmt,
plot.EGA = FALSE # No plot for CRAN checks
)
# Network scores
net.scores(data = wmt, A = ega.wmt)
Compares Network Structures Using Permutation
Description
A permutation test of whether two network structures differ significantly from one another
Usage
network.compare(
base,
comparison,
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
iter = 1000,
ncores,
verbose = TRUE,
seed = NULL,
...
)
Arguments
base |
Matrix or data frame. Should consist only of variables to be used in the analysis. First dataset |
comparison |
Matrix or data frame. Should consist only of variables to be used in the analysis. Second dataset |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
iter |
Numeric (length = 1).
Number of permutations to perform.
Defaults to |
ncores |
Numeric (length = 1).
Number of cores to use in computing results.
Defaults to |
verbose |
Boolean (length = 1).
Should progress be displayed?
Defaults to |
seed |
Numeric (length = 1).
Defaults to |
... |
Additional arguments that can be passed on to
|
Value
Returns a list:
network |
Data frame with row names of each measure, empirical value ( |
edges |
List containing matrices of values for empirical values ( |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Frobenius Norm
Ulitzsch, E., Khanna, S., Rhemtulla, M., & Domingue, B. W. (2023).
A graph theory based similarity metric enables comparison of subpopulation psychometric networks.
Psychological Methods.
Jensen-Shannon Similarity (1 - Distance)
De Domenico, M., Nicosia, V., Arenas, A., & Latora, V. (2015).
Structural reducibility of multilayer networks.
Nature Communications, 6(1), 1–9.
Total Network Strength
van Borkulo, C. D., van Bork, R., Boschloo, L., Kossakowski, J. J., Tio, P., Schoevers, R. A., Borsboom, D., & Waldorp, L. J. (2023).
Comparing network structures on three aspects: A permutation test.
Psychological Methods, 28(6), 1273–1285.
Examples
# Load data
wmt <- wmt2[,7:24]
# Set groups (if necessary)
groups <- rep(1:2, each = nrow(wmt) / 2)
# Groups
group1 <- wmt[groups == 1,]
group2 <- wmt[groups == 2,]
## Not run:
# Perform comparison
results <- network.compare(group1, group2)
# Print results
print(results)
# Plot edge differences
plot(results)
## End(Not run)
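The general shape of such a permutation comparison can be sketched in base R. The statistic below is the plain Frobenius norm of the difference between two correlation matrices, which is an assumption chosen for simplicity and may differ from the measures the package reports. frobenius_diff and compare_sketch are hypothetical names.

```r
# Frobenius norm of the difference between two correlation matrices
frobenius_diff <- function(x, y) {
  norm(cor(x) - cor(y), type = "F")
}

compare_sketch <- function(base, comparison, iter = 1000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  pooled <- rbind(base, comparison)
  n_base <- nrow(base)
  empirical <- frobenius_diff(base, comparison)

  # Reassign cases to the two datasets at random
  null <- replicate(iter, {
    shuffled <- sample(nrow(pooled))
    frobenius_diff(
      pooled[shuffled[seq_len(n_base)], , drop = FALSE],
      pooled[shuffled[-seq_len(n_base)], , drop = FALSE]
    )
  })

  # One-sided p-value: the norm is non-negative, so only large
  # values count as evidence of a difference
  c(empirical = empirical, p = mean(null >= empirical))
}

set.seed(10)
g1 <- matrix(rnorm(150 * 3), ncol = 3)
g2 <- matrix(rnorm(150 * 3), ncol = 3)
out <- compare_sketch(g1, g2, iter = 200, seed = 11)
```

Pooling the two datasets and reshuffling cases keeps the group sizes fixed while breaking any association between group membership and the correlation structure.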
Apply a Network Estimation Method
Description
General function to apply network estimation methods in EGAnet
Usage
network.estimation(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
network.only = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
network.only |
Boolean (length = 1).
Whether the network only should be output.
Defaults to |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns a matrix populated with a network from the input data
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Graphical Least Absolute Shrinkage and Selection Operator (GLASSO)
Friedman, J., Hastie, T., & Tibshirani, R. (2008).
Sparse inverse covariance estimation with the graphical lasso.
Biostatistics, 9(3), 432–441.
GLASSO with Extended Bayesian Information Criterion (EBICglasso)
Epskamp, S., & Fried, E. I. (2018).
A tutorial on regularized partial correlation networks.
Psychological Methods, 23(4), 617–634.
Bayesian Gaussian Graphical Model (BGGM)
Williams, D. R. (2021).
Bayesian estimation for Gaussian graphical models: Structure learning, predictability, and network comparisons.
Multivariate Behavioral Research, 56(2), 336–352.
Triangulated Maximally Filtered Graph (TMFG)
Massara, G. P., Di Matteo, T., & Aste, T. (2016).
Network filtering for big data: Triangulated maximally filtered graph.
Journal of Complex Networks, 5, 161-178.
Examples
# Load data
wmt <- wmt2[,7:24]
# EBICglasso (default for EGA functions)
glasso_network <- network.estimation(
data = wmt, model = "glasso"
)
# TMFG
tmfg_network <- network.estimation(
data = wmt, model = "TMFG"
)
GLASSO with Non-convex Penalties
Description
The graphical least absolute shrinkage and selection operator with non-convex regularization penalties
Usage
network.nonconvex(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
penalty = c("iPOT", "LGP", "POP", "SPOT"),
gamma = NULL,
lambda = NULL,
nlambda = 50,
lambda.min.ratio = 0.01,
penalize.diagonal = TRUE,
optimize.over = c("none", "lambda", "both"),
ic = c("AIC", "AICc", "BIC", "EBIC"),
ebic.gamma = 0.5,
fast = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
n |
Numeric (length = 1).
Sample size must be provided if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
penalty |
Character (length = 1). Available options:
|
gamma |
Numeric (length = 1). Adjusts the shape of the penalty. Defaults:
|
lambda |
Numeric (length = 1). Adjusts the initial penalty provided to the non-convex penalty function |
nlambda |
Numeric (length = 1).
Number of lambda values to test.
Defaults to |
lambda.min.ratio |
Numeric (length = 1).
Ratio of lowest lambda value compared to maximal lambda.
Defaults to |
penalize.diagonal |
Boolean (length = 1).
Should the diagonal be penalized?
Defaults to |
optimize.over |
Character (length = 1).
Whether optimization of lambda, gamma, both, or no hyperparameters should be performed.
Defaults to |
ic |
Character (length = 1). What information criterion should be used for model selection? Available options include:
Term definitions:
Defaults to |
ebic.gamma |
Numeric (length = 1).
Value to set gamma parameter in EBIC (see above).
Defaults to Only used if |
fast |
Boolean (length = 1).
Whether the The fast results may differ by less than floating point of the original
GLASSO implemented by |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
... |
Additional arguments to be passed on to |
Value
A network matrix
Author(s)
Alexander P. Christensen <alexpaulchristensen at gmail.com> and Hudson Golino <hfg9s at virginia.edu>
Examples
# Obtain data
wmt <- wmt2[,7:24]
# Obtain network
awe_network <- network.nonconvex(data = wmt)
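To show how nlambda and lambda.min.ratio typically interact, the sketch below builds a log-spaced grid of candidate lambda values from a correlation matrix. This follows a common convention in GLASSO software; whether network.nonconvex constructs its grid exactly this way is an assumption, and lambda_path is a hypothetical helper.

```r
# Hypothetical helper: log-spaced lambda path from a correlation matrix
lambda_path <- function(R, nlambda = 50, lambda.min.ratio = 0.01) {
  lambda_max <- max(abs(R[upper.tri(R)]))   # largest off-diagonal value
  lambda_min <- lambda_max * lambda.min.ratio
  exp(seq(log(lambda_min), log(lambda_max), length.out = nlambda))
}

set.seed(5)
R <- cor(matrix(rnorm(100 * 5), ncol = 5))
lambdas <- lambda_path(R)
```

Each candidate lambda would then be used to fit a network, with the information criterion named in ic selecting among the fits.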
Predict New Data based on Network
Description
General function to compute a network's predictive power on new data, following Haslbeck and Waldorp (2018) and Williams and Rodriguez (2022).
This implementation differs from the predictability in the mgm package (Haslbeck), which is based on (regularized) regression. This implementation uses the network directly, converting the partial correlations into an implied precision (inverse covariance) matrix. See Details for more information
Usage
network.predictability(network, original.data, newdata, ordinal.categories = 7)
Arguments
network |
Matrix or data frame. A partial correlation network |
original.data |
Matrix or data frame.
Must consist only of variables to be used to estimate the |
newdata |
Matrix or data frame.
Must consist of the same variables in the same order as |
ordinal.categories |
Numeric (length = 1).
Maximum number of categories a variable can have before it is considered continuous.
Defaults to |
Details
This implementation of network predictability proceeds in several steps with important assumptions:
1. Network was estimated using (partial) correlations (not regression like the mgm package!)
2. Original data that was used to estimate the network in 1. is necessary to apply the original scaling to the new data
3. (Linear) regression-like coefficients are obtained by reverse engineering the inverse covariance matrix using the network's partial correlations (i.e., by setting the diagonal of the network to -1 and computing the inverse of the opposite signed partial correlation matrix; see EGAnet:::pcor2inv)
4. Predicted values are obtained by matrix multiplying the new data with these coefficients
5. Dichotomous and polytomous data are given categorical values based on the original data's thresholds and these thresholds are used to convert the continuous predicted values into their corresponding categorical values
6. Evaluation metrics:
- dichotomous: "Accuracy" or the percent correctly predicted for the 0s and 1s and "Kappa" or Cohen's Kappa (see References)
- polytomous: "Linear Kappa" or linearly weighted Kappa and "Krippendorff's alpha" (see References)
- continuous: R-squared ("R2") and root mean square error ("RMSE")
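Steps 3, 4, and the continuous metrics in step 6 can be sketched in base R for continuous data. This is an illustration under assumptions, not the package's code: the network is an unregularized partial correlation matrix, the rescaling to the correlation metric via cov2cor is one way to recover a usable precision matrix, predictions are made in-sample, and all object names are hypothetical.

```r
# Simulate correlated data and standardize it
set.seed(123)
n <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
x3 <- 0.3 * x1 + 0.4 * x2 + rnorm(n)
X <- scale(cbind(x1, x2, x3))

# Partial correlation "network" (unregularized, zero diagonal)
K <- solve(cor(X))
P <- -cov2cor(K)
diag(P) <- 0

# Step 3: recover the precision matrix from the partial correlations
Theta <- -P
diag(Theta) <- 1                          # opposite-signed, unit diagonal
K_implied <- solve(cov2cor(solve(Theta))) # precision on the correlation scale

# Regression-like coefficients: betas[i, j] is the weight of
# variable j when predicting variable i
betas <- -K_implied / diag(K_implied)
diag(betas) <- 0

# Step 4: predictions are a matrix product of data and coefficients
predicted <- X %*% t(betas)

# Continuous evaluation metrics for the first variable
r2   <- cor(predicted[, 1], X[, 1])^2
rmse <- sqrt(mean((predicted[, 1] - X[, 1])^2))

# Cross-check against ordinary least squares
ols <- coef(lm(X[, 1] ~ X[, 2:3] - 1))
```

With an unregularized network the recovered coefficients match ordinary least squares on the standardized data. With a regularized network (for example, from EBICglasso) they would generally be shrunken relative to OLS.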
Value
Returns a list containing:
predictions |
Predicted values of |
betas |
Beta coefficients derived from the |
results |
Performance metrics for each variable in |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Original Implementation of Node Predictability
Haslbeck, J. M., & Waldorp, L. J. (2018).
How well do network models predict observations? On the importance of predictability in network models.
Behavior Research Methods, 50(2), 853–861.
Derivation of Regression Coefficients Used (Formula 3)
Williams, D. R., & Rodriguez, J. E. (2022).
Why overfitting is not (usually) a problem in partial correlation networks.
Psychological Methods, 27(5), 822–840.
Cohen's Kappa
Cohen, J. (1960). A coefficient of agreement for nominal scales.
Educational and Psychological Measurement, 20(1), 37-46.
Cohen, J. (1968). Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213-220.
Krippendorff's alpha
Krippendorff, K. (2013).
Content analysis: An introduction to its methodology (3rd ed.).
Thousand Oaks, CA: Sage.
Examples
# Load data
wmt <- wmt2[,7:24]
# Set seed (to reproduce results)
set.seed(42)
# Split data
training <- sample(
1:nrow(wmt), round(nrow(wmt) * 0.80) # 80/20 split
)
# Set splits
wmt_train <- wmt[training,]
wmt_test <- wmt[-training,]
# EBICglasso (default for EGA functions)
glasso_network <- network.estimation(
data = wmt_train, model = "glasso"
)
# Check predictability
network.predictability(
network = glasso_network, original.data = wmt_train,
newdata = wmt_test
)
Optimism Data
Description
A response matrix (n = 282) containing responses to 10 items of the Revised Life Orientation Test (LOT-R), developed by Scheier, Carver, & Bridges (1994).
Usage
data(optimism)
Format
A 282x10 response matrix
References
Scheier, M. F., Carver, C. S., & Bridges, M. W. (1994). Distinguishing optimism from neuroticism (and trait anxiety, self-mastery, and self-esteem): a reevaluation of the Life Orientation Test. Journal of Personality and Social Psychology, 67, 1063-1078.
Examples
data("optimism")
Computes Polychoric Correlations
Description
A fast implementation of polychoric correlations in C. Uses the Beasley-Springer-Moro algorithm (Beasley & Springer, 1977; Moro, 1995) to estimate the inverse univariate normal CDF, the Drezner-Wesolowsky approximation (Drezner & Wesolowsky, 1990) to estimate the bivariate normal CDF, and Brent's method (Brent, 2013) for optimization of rho.
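To make the three ingredients concrete, here is a minimal base R sketch for the simplest case, a 2 x 2 table (the tetrachoric correlation). It is not the package's C implementation: thresholds come from qnorm rather than Beasley-Springer-Moro, the bivariate normal CDF is obtained by numerical integration rather than the Drezner-Wesolowsky approximation, and rho is optimized with optimize, which R documents as based on Brent's work. The function names and the example table are hypothetical.

```r
# Bivariate standard normal CDF P(X < a, Y < b) with correlation rho,
# computed by one-dimensional numerical integration
bvn_cdf <- function(a, b, rho) {
  integrand <- function(x) {
    dnorm(x) * pnorm((b - rho * x) / sqrt(1 - rho^2))
  }
  integrate(integrand, lower = -Inf, upper = a)$value
}

# Maximum likelihood estimate of rho for a 2 x 2 frequency table
tetrachoric_sketch <- function(tab) {
  n <- sum(tab)
  # Thresholds from marginal proportions via the inverse normal CDF
  t_row <- qnorm(sum(tab[1, ]) / n)
  t_col <- qnorm(sum(tab[, 1]) / n)
  neg_loglik <- function(rho) {
    p11 <- bvn_cdf(t_row, t_col, rho) # both variables in first category
    p12 <- pnorm(t_row) - p11         # row first, column second
    p21 <- pnorm(t_col) - p11         # row second, column first
    p22 <- 1 - p11 - p12 - p21        # both in second category
    # Order matches as.vector(tab), which is column-major
    probs <- pmax(c(p11, p21, p12, p22), 1e-12)
    -sum(as.vector(tab) * log(probs))
  }
  # One-dimensional optimization over rho
  optimize(neg_loglik, interval = c(-0.99, 0.99))$minimum
}

# Hypothetical table with a positive association
example_table <- matrix(c(40, 10, 10, 40), nrow = 2)
rho_hat <- tetrachoric_sketch(example_table)
```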
Usage
polychoric.matrix(
data,
na.data = c("pairwise", "listwise"),
empty.method = c("none", "zero", "all"),
empty.value = c("none", "point_five", "one_over"),
...
)
Arguments
data |
Matrix or data frame.
A dataset with all ordinal values
(rows = cases, columns = variables).
Data are required to be between |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
empty.method |
Character (length = 1). Method for empty cell correction. Available options:
|
empty.value |
Character (length = 1). Value to add to the joint frequency table cells. Accepts numeric values between 0 and 1 or specific methods:
|
... |
Not used but made available for easier argument passing |
Value
Returns a polychoric correlation matrix
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com> with assistance from GPT-4
References
Beasley-Springer-Moro algorithm
Beasley, J. D., & Springer, S. G. (1977).
Algorithm AS 111: The percentage points of the normal distribution.
Journal of the Royal Statistical Society. Series C (Applied Statistics), 26(1), 118-121.
Moro, B. (1995). The full monte. Risk 8 (February), 57-58.
Brent optimization
Brent, R. P. (2013).
Algorithms for minimization without derivatives.
Mineola, NY: Dover Publications, Inc.
Drezner-Wesolowsky bivariate normal approximation
Drezner, Z., & Wesolowsky, G. O. (1990).
On the computation of the bivariate normal integral.
Journal of Statistical Computation and Simulation, 35(1-2), 101-107.
Examples
# Load data (ensure matrix for missing data example)
wmt <- as.matrix(wmt2[,7:24])
# Compute polychoric correlation matrix
correlations <- polychoric.matrix(wmt)
# Randomly assign missing data
wmt[sample(1:length(wmt), 1000)] <- NA
# Compute polychoric correlation matrix
# with pairwise missing
pairwise_correlations <- polychoric.matrix(
wmt, na.data = "pairwise"
)
# Compute polychoric correlation matrix
# with listwise missing
listwise_correlations <- polychoric.matrix(
wmt, na.data = "listwise"
)
Prime Numbers through 100,000
Description
Numeric vector of primes generated from the primes package. Used in the function ergoInfo. Not for general use
Usage
data(prime.num)
Format
A numeric vector of prime numbers up to 100,000
Examples
data("prime.num")
Random-Intercept EGA
Description
Estimates the number of substantive dimensions after controlling for wording effects. EGA is applied to a residual correlation matrix after subtracting a random intercept factor with equal unstandardized loadings from all the regular and unrecoded reversed items in the dataset.
Usage
riEGA(
data,
n = NULL,
corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
plot.EGA = TRUE,
verbose = FALSE,
...
)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis. Must be raw data and not a correlation matrix |
n |
Numeric (length = 1).
Sample size if |
corr |
Character (length = 1).
Method to compute correlations.
Defaults to
For other similarity measures, compute them first and input them
into |
na.data |
Character (length = 1).
How should missing data be handled?
Defaults to
|
model |
Character (length = 1).
Defaults to
|
algorithm |
Character or
|
uni.method |
Character (length = 1).
What unidimensionality method should be used?
Defaults to
|
plot.EGA |
Boolean (length = 1).
If |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
... |
Additional arguments to be passed on to
|
Value
Returns a list containing:
EGA |
Results from |
RI |
A list containing information about the random-intercept model (if the model converged): |
TEFI |
|
plot.EGA |
Plot output if |
Author(s)
Alejandro Garcia-Pardina <alejandrogp97@gmail.com>, Francisco J. Abad <fjose.abad@uam.es>, Alexander P. Christensen <alexpaulchristensen@gmail.com>, Hudson Golino <hfg9s at virginia.edu>, Luis Eduardo Garrido <luisgarrido@pucmm.edu.do>, and Robert Moulder <rgm4fd@virginia.edu>
References
Selection of CFA Estimator
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012).
When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions.
Psychological Methods, 17, 354-373.
See Also
plot.EGAnet for plot usage in EGAnet
Examples
# Obtain example data
wmt <- wmt2[,7:24]
# riEGA example
riEGA(data = wmt, plot.EGA = FALSE)
# no plot for CRAN checks
sim.dynEGA Data
Description
Simulated multivariate time series data with 24 variables, 100 individuals, 50 time points per individual, and 2 groups of individuals
Usage
data(sim.dynEGA)
Format
A 5000 x 26 multivariate time series
Details
Data were generated using the simDFM function with the following arguments:
Group 1
simDFM(
variab = 12, timep = 50,
nfact = 2, error = 0.125,
dfm = "DAFS",
loadings = EGAnet:::runif_xoshiro(
1, min = 0.50, max = 0.70
), autoreg = 0.80, crossreg = 0.00,
var.shock = 0.36, cov.shock = 0.18
)
Group 2
simDFM(
variab = 8, timep = 50,
nfact = 3, error = 0.125,
dfm = "DAFS",
loadings = EGAnet:::runif_xoshiro(
1, min = 0.50, max = 0.70
), autoreg = 0.80, crossreg = 0.00,
var.shock = 0.36, cov.shock = 0.18
)
Examples
data("sim.dynEGA")
Simulate data following a Dynamic Factor Model
Description
Function to simulate data following a dynamic factor model (DFM). Two DFMs are currently available: the direct autoregressive factor score model (Engle & Watson, 1981; Nesselroade, McArdle, Aggen, and Meyers, 2002) and the dynamic factor model with random walk factor scores.
Usage
simDFM(
variab,
timep,
nfact,
error,
dfm = c("DAFS", "RandomWalk"),
loadings,
autoreg,
crossreg,
var.shock,
cov.shock,
burnin = 1000
)
Arguments
variab |
Number of variables per factor. |
timep |
Number of time points. |
nfact |
Number of factors. |
error |
Value to be used to construct a diagonal matrix Q. Q is the p x p covariance matrix used to generate random errors from a multivariate normal distribution with zero means. The value provided is squared before constructing Q. |
dfm |
A string indicating the dynamical factor model to use. Current options are:
|
loadings |
Magnitude of the loadings. |
autoreg |
Magnitude of the autoregression coefficients. |
crossreg |
Magnitude of the cross-regression coefficients. |
var.shock |
Magnitude of the random shock variance. |
cov.shock |
Magnitude of the random shock covariance |
burnin |
Number of initial samples to discard when computing the factor scores. Defaults to 1000. |
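The following base R sketch shows how a direct autoregressive factor score (DAFS) process can be simulated from quantities like the arguments above: factor scores follow a first-order vector autoregression driven by correlated shocks, and observed variables are loadings times factor scores plus measurement error. It is a simplified illustration under stated assumptions, not the simDFM implementation. All object names are hypothetical, and the burn-in is shortened to keep the example fast.

```r
set.seed(1)

# Settings loosely mirroring the simDFM arguments (hypothetical values)
n_factors       <- 2
vars_per_factor <- 3
n_time          <- 50
burn_in         <- 100
error_sd        <- 0.05 # squared to form the diagonal of Q

# Block-structured loading matrix: each variable loads on one factor
loadings <- kronecker(diag(n_factors), matrix(0.7, vars_per_factor, 1))

# Autoregressive (diagonal) and cross-regressive (off-diagonal) effects
B <- matrix(0.1, n_factors, n_factors)
diag(B) <- 0.8

# Random shock covariance matrix and its Cholesky factor
shock_cov <- matrix(0.18, n_factors, n_factors)
diag(shock_cov) <- 0.36
shock_chol <- chol(shock_cov)

# Generate factor scores, including the burn-in period
total  <- n_time + burn_in
scores <- matrix(0, total, n_factors)
for (t in 2:total) {
  shock <- drop(rnorm(n_factors) %*% shock_chol)
  scores[t, ] <- drop(B %*% scores[t - 1, ]) + shock
}

# Discard the burn-in samples
scores <- scores[-seq_len(burn_in), , drop = FALSE]

# Observed variables: loadings times factor scores plus random error
errors <- matrix(
  rnorm(n_time * nrow(loadings), sd = error_sd), nrow = n_time
)
observed <- scores %*% t(loadings) + errors
```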
Author(s)
Hudson F. Golino <hfg9s at virginia.edu>
References
Engle, R., & Watson, M. (1981). A one-factor multivariate time series model of metropolitan wage rates. Journal of the American Statistical Association, 76(376), 774-781.
Nesselroade, J. R., McArdle, J. J., Aggen, S. H., & Meyers, J. M. (2002). Dynamic factor analysis models for representing process in multivariate time-series. In D. S. Moskowitz & S. L. Hershberger (Eds.), Multivariate applications book series. Modeling intraindividual variability with repeated measures data: Methods and applications, 235-265.
Examples
## Not run:
# Simulate data from a DAFS model
data1 <- simDFM(variab = 5, timep = 50, nfact = 3, error = 0.05,
dfm = "DAFS", loadings = 0.7, autoreg = 0.8,
crossreg = 0.1, var.shock = 0.36,
cov.shock = 0.18, burnin = 1000)
## End(Not run)
Simulate data following an Exploratory Graph Model (EGM)
Description
Function to simulate data based on EGM
Usage
simEGM(
communities,
variables,
loadings,
cross.loadings = 0.02,
correlations,
sample.size,
max.iterations = 1000
)
Arguments
communities |
Numeric (length = 1). Number of communities to generate |
variables |
Numeric vector (length = 1 or |
loadings |
Numeric (length = 1). Magnitude of the assigned network loadings. For reference, small (0.20), moderate (0.35), and large (0.50) Uses |
cross.loadings |
Numeric (length = 1).
Standard deviation of a normal distribution with a mean of zero ( |
correlations |
Numeric (length = 1). Magnitude of the community correlations |
sample.size |
Numeric (length = 1). Number of observations to generate |
max.iterations |
Numeric (length = 1).
Number of iterations to attempt to get convergence before erroring out.
Defaults to |
Author(s)
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
simulated <- simEGM(
communities = 2, variables = 6,
loadings = 0.55, # use standard factor loading sizes
correlations = 0.30,
sample.size = 1000
)
Total Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices
Description
Computes the fit (TEFI) of a dimensionality structure using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data.
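For intuition, the building block of the index can be sketched in base R: a correlation matrix is rescaled to unit trace so that it behaves like a density matrix, and Von Neumann's entropy is the negative sum of eigenvalue times log eigenvalue. The sketch below computes this entropy for a full matrix and for each community's sub-matrix. How these quantities are combined into TEFI is not reproduced here, and the function name, example matrix, and structure are hypothetical.

```r
# Von Neumann entropy of a correlation matrix (sketch)
vn_entropy_sketch <- function(correlation) {
  # Rescale to unit trace so the eigenvalues sum to one
  density <- correlation / ncol(correlation)
  eigenvalues <- eigen(
    density, symmetric = TRUE, only.values = TRUE
  )$values
  # Drop zero (or numerically negative) eigenvalues: 0 * log(0) is taken as 0
  eigenvalues <- eigenvalues[eigenvalues > 0]
  -sum(eigenvalues * log(eigenvalues))
}

# Hypothetical correlation matrix with two blocks of two variables
correlation <- diag(4)
correlation[1, 2] <- correlation[2, 1] <- 0.6
correlation[3, 4] <- correlation[4, 3] <- 0.5
structure <- c(1, 1, 2, 2)

# Entropy of the full matrix
total_entropy <- vn_entropy_sketch(correlation)

# Entropy of each community's sub-matrix
community_entropy <- vapply(
  split(seq_along(structure), structure),
  function(index) vn_entropy_sketch(correlation[index, index]),
  numeric(1)
)
```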
Usage
tefi(data, structure = NULL, verbose = TRUE)
Arguments
data |
Matrix, data frame, or |
structure |
Numeric or character vector (length = |
verbose |
Boolean (length = 1).
Whether messages and (insignificant) warnings should be output.
Defaults to |
Value
Returns a data frame with columns:
Non-hierarchical Structure
VN.Entropy.Fit |
The Total Entropy Fit Index using Von Neumann's entropy |
Total.Correlation |
The total correlation of the dataset |
Average.Entropy |
The average entropy of the dataset |
Hierarchical Structure
VN.Entropy.Fit |
The Generalized Total Entropy Fit Index using Von Neumann's entropy |
Lower.Order.VN |
Lower order (only) Total Entropy Fit Index |
Higher.Order.VN |
Higher order (only) Total Entropy Fit Index |
Author(s)
Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen@gmail.com>, and Robert Moulder <rgm4fd@virginia.edu>
References
Initial formalization and simulation
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate EGA model
ega.wmt <- EGA(
data = wmt, model = "glasso",
plot.EGA = FALSE # no plot for CRAN checks
)
# Compute entropy indices for empirical EGA
tefi(ega.wmt)
# User-defined structure (with `EGA` object)
tefi(ega.wmt, structure = c(rep(1, 5), rep(2, 5), rep(3, 8)))
Compare Total Entropy Fit Index (tefi) Between Two Structures
Description
This function computes the tefi values for two different structures using bootstrapped correlation matrices from bootEGA and compares them using a non-parametric bootstrap test. It also visualizes the distributions of tefi values for both structures.
Usage
tefi.compare(bootega.obj, base, comparison, plot.TEFI = TRUE, ...)
Arguments
bootega.obj |
A |
base |
Numeric (length = columns in original dataset). A vector representing the base structure to be tested |
comparison |
Numeric (length = columns in original dataset).
A vector representing the structure to be compared against the |
plot.TEFI |
Boolean (length = 1).
Whether the TEFI comparison and the p-value should be plotted.
Defaults to |
... |
Additional arguments that can be passed on to |
Details
The null hypothesis is that the TEFI values obtained in the bootstrapped correlation matrices for the base structure are not lower than the TEFI values obtained in the bootstrapped correlation matrices for the comparison structure.
Therefore, the p-value in this bootstrap test can be interpreted as follows:
If the p-value is less than 0.05: TEFI values for the base structure tend to be lower than those for the comparison structure, indicating that the former provides a better fit (lower entropy) than the latter
If the p-value is greater than 0.05: TEFI values for the base structure are not significantly lower than those for the comparison structure, suggesting that both structures may provide similar fits or that comparison might fit better
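One plausible way to obtain such a p-value is sketched below in base R: given one TEFI value per bootstrap replicate for each structure, the p-value is taken as the proportion of replicates in which the base structure is not lower than the comparison structure. The TEFI values here are randomly generated placeholders, and it is an assumption that tefi.compare computes its p-value in exactly this way.

```r
set.seed(123)

# Placeholder TEFI values for 500 bootstrap replicates (not real results)
tefi_base       <- rnorm(500, mean = -10.5, sd = 0.4)
tefi_comparison <- rnorm(500, mean = -10.0, sd = 0.4)

# Proportion of replicates where base is NOT lower than comparison
p_value <- mean(tefi_base >= tefi_comparison)

# Decision at the conventional 0.05 level
base_fits_better <- p_value < 0.05
```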
Value
A list containing:
TEFI.df |
A data frame containing the TEFI values for both structures |
p.value |
The p-value from the non-parametric bootstrap hypothesis test |
Author(s)
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Obtain data
wmt <- wmt2[,7:24]
## Not run:
# Perform bootstrap EGA
boot.wmt <- bootEGA(
data = wmt, iter = 500,
type = "parametric", ncores = 2
)
# Perform comparison
comparing_tefi <- tefi.compare(
boot.wmt,
base = boot.wmt$EGA$wc, # Compare Walktrap
comparison = community.detection(
boot.wmt$EGA$network, algorithm = "louvain"
) # With Louvain
)
# Plot options (UVa colors)
plot(
comparing_tefi,
base.name = "Walktrap", base.color = "#232D4B",
comparison.name = "Louvain", comparison.color = "#E57200"
)
## End(Not run)
Total Correlation
Description
Computes the total correlation of a dataset
Usage
totalCor(data, base = 2.718282)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
base |
Numeric (length = 1).
Base to use for entropy.
Defaults to |
Value
Returns a list containing:
Ind.Entropies |
Individual entropies for each variable |
Joint.Entropy |
The joint entropy of the dataset |
Total.Cor |
The total correlation of the dataset |
Normalized |
Total correlation divided by the sum of the individual entropies minus the maximum of the individual entropies |
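The quantities above can be illustrated with a short base R sketch for discrete data without missing values: total correlation is the sum of the individual entropies minus the joint entropy, and the normalized version divides by the sum of the individual entropies minus their maximum. The helper names are hypothetical and the code is not the totalCor implementation.

```r
# Shannon entropy of a discrete variable (assumes no missing values)
entropy_discrete <- function(x, base = exp(1)) {
  p <- as.numeric(table(x)) / length(x)
  p <- p[p > 0]
  -sum(p * log(p, base = base))
}

# Total correlation and its normalized version (sketch)
total_cor_sketch <- function(data, base = exp(1)) {
  data <- as.data.frame(data)
  # Entropy of each variable separately
  individual <- vapply(data, entropy_discrete, numeric(1), base = base)
  # Joint entropy: treat each row's response pattern as one category
  patterns <- do.call(paste, c(data, sep = "\r"))
  joint <- entropy_discrete(patterns, base = base)
  total <- sum(individual) - joint
  list(
    individual = individual,
    joint = joint,
    total = total,
    normalized = total / (sum(individual) - max(individual))
  )
}

# Two identical binary variables share all of their information
x <- c(0, 0, 1, 1)
identical_vars <- total_cor_sketch(data.frame(a = x, b = x))

# Two independent binary variables share none
independent_vars <- total_cor_sketch(
  data.frame(a = c(0, 0, 1, 1), b = c(0, 1, 0, 1))
)
```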
Author(s)
Hudson F. Golino <hfg9s at virginia.edu>
References
Formalization of total correlation
Watanabe, S. (1960).
Information theoretical analysis of multivariate correlation.
IBM Journal of Research and Development, 4, 66-82.
Applied implementation
Felix, L. M., Mansur-Alves, M., Teles, M., Jamison, L., & Golino, H. (2021).
Longitudinal impact and effects of booster sessions in a cognitive training program for healthy older adults.
Archives of Gerontology and Geriatrics, 94, 104337.
Examples
# Compute total correlation
totalCor(wmt2[,7:24])
Total Correlation Matrix
Description
Computes the pairwise total correlation (totalCor) for a dataset
Usage
totalCorMat(data, base = 2.718282, normalized = FALSE)
Arguments
data |
Matrix or data frame. Should consist only of variables to be used in the analysis |
base |
Numeric (length = 1).
Base to use for entropy.
Defaults to |
normalized |
Boolean (length = 1).
Should the normalized total correlation be computed?
Defaults to |
Value
Returns a symmetric matrix with pairwise total correlations
Author(s)
Hudson F. Golino <hfg9s at virginia.edu>
References
Formalization of total correlation
Watanabe, S. (1960).
Information theoretical analysis of multivariate correlation.
IBM Journal of Research and Development, 4, 66-82.
Applied implementation
Felix, L. M., Mansur-Alves, M., Teles, M., Jamison, L., & Golino, H. (2021).
Longitudinal impact and effects of booster sessions in a cognitive training program for healthy older adults.
Archives of Gerontology and Geriatrics, 94, 104337.
Examples
# Compute total correlation matrix
totalCorMat(wmt2[,7:24])
Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices
Description
Computes the fit of a dimensionality structure using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data.
Usage
vn.entropy(data, structure)
Arguments
data |
Matrix or data frame. Contains variables to be used in the analysis |
structure |
Numeric or character vector (length = |
Value
Returns a list containing:
VN.Entropy.Fit |
The Entropy Fit Index using Von Neumann's entropy |
Total.Correlation |
The total correlation of the dataset |
Average.Entropy |
The average entropy of the dataset |
Author(s)
Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen@gmail.com>, and Robert Moulder <rgm4fd@virginia.edu>
References
Initial formalization and simulation
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020).
Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables.
Multivariate Behavioral Research.
Examples
# Get EGA result
ega.wmt <- EGA(
data = wmt2[,7:24], model = "glasso",
plot.EGA = FALSE # no plot for CRAN checks
)
# Compute Von Neumann entropy
vn.entropy(ega.wmt$correlation, ega.wmt$wc)
WMT-2 Data
Description
A response matrix (n = 1185) of the Wiener Matrizen-Test 2 (WMT-2).
Usage
data(wmt2)
Format
A 1185x24 response matrix
Examples
data("wmt2")
Weighted Topological Overlap
Description
Computes weighted topological overlap following the Nowick et al. (2009) definition
Usage
wto(network, signed = TRUE, diagonal.zero = TRUE)
Arguments
network |
Symmetric matrix or data frame. A symmetric network |
signed |
Boolean (length = 1).
Whether the signed version should be used.
Defaults to |
diagonal.zero |
Boolean (length = 1).
Whether diagonal of overlap matrix should be set to zero.
Defaults to |
Value
A symmetric matrix of weighted topological overlap values between each pair of variables
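As a point of reference, a commonly cited form of the signed weighted topological overlap between nodes i and j is the sum of shared-neighbor products plus the direct edge, divided by the smaller of the two node strengths plus one minus the absolute direct edge. The base R sketch below implements that form for a small made-up network. The function name is hypothetical, and it is an assumption that wto computes exactly this expression.

```r
# Weighted topological overlap (sketch of a commonly cited signed form)
wto_sketch <- function(network) {
  A <- as.matrix(network)
  diag(A) <- 0
  # Node strength: sum of absolute edge weights
  strength <- rowSums(abs(A))
  # Shared-neighbor products plus the direct connection
  numerator <- A %*% A + A
  # Smaller strength of each pair, plus one, minus the absolute direct edge
  denominator <- outer(strength, strength, pmin) + 1 - abs(A)
  overlap <- numerator / denominator
  diag(overlap) <- 0
  overlap
}

# Hypothetical symmetric network with three nodes
network_example <- matrix(
  c(0.0, 0.5, 0.3,
    0.5, 0.0, 0.2,
    0.3, 0.2, 0.0),
  nrow = 3, byrow = TRUE
)
overlap_example <- wto_sketch(network_example)
```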
References
Original formalization
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009).
Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain.
Proceedings of the National Academy of Sciences, 106, 22358-22363.
Examples
# Obtain network
network <- network.estimation(wmt2[,7:24], model = "glasso")
# Compute wTO
wto(network)