| Type: | Package |
| Title: | Collaborative Graphical Lasso - Multi-Omics Network Reconstruction |
| Version: | 1.1.0 |
| Description: | Reconstruct networks from multi-omics data sets with the collaborative graphical lasso (coglasso) algorithm described in Albanese, A., Kohlen, W., and Behrouzi, P. (2024) <doi:10.48550/arXiv.2403.18602>. Use the main wrapper function 'bs()' to build and select a multi-omics network. |
| URL: | https://github.com/DrQuestion/coglasso, https://drquestion.github.io/coglasso/ |
| BugReports: | https://github.com/DrQuestion/coglasso/issues |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Imports: | igraph, lifecycle, Matrix, Rcpp (≥ 1.0.11), rlang, stats, utils, withr |
| LinkingTo: | Rcpp, RcppEigen |
| Depends: | R (≥ 2.10) |
| LazyData: | true |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | yes |
| Packaged: | 2025-10-28 10:38:10 UTC; alban003 |
| Author: | Alessio Albanese |
| Maintainer: | Alessio Albanese <alessio.albanese@wur.nl> |
| Repository: | CRAN |
| Date/Publication: | 2025-10-28 11:00:03 UTC |
coglasso: Collaborative Graphical Lasso - Multi-Omics Network Reconstruction
Description
Reconstruct networks from multi-omics data sets with the collaborative graphical lasso (coglasso) algorithm described in Albanese, A., Kohlen, W., and Behrouzi, P. (2024) doi:10.48550/arXiv.2403.18602. Use the main wrapper function 'bs()' to build and select a multi-omics network.
Author(s)
Maintainer: Alessio Albanese alessio.albanese@wur.nl (ORCID) [copyright holder]
Authors:
Pariya Behrouzi pariya.behrouzi@wur.nl (ORCID)
See Also
Useful links:
Report bugs at https://github.com/DrQuestion/coglasso/issues
Build multiple networks and select the best one from a multi-omics data set
Description
bs() wraps the two main functions of the package in a single one:
coglasso(), to build multiple multi-omics networks, and select_coglasso()
to select the best one according to the chosen criterion.
Usage
bs(
data,
p = NULL,
pX = lifecycle::deprecated(),
lambda_w = NULL,
lambda_b = NULL,
c = NULL,
nlambda_w = NULL,
nlambda_b = NULL,
nc = NULL,
lambda_w_max = NULL,
lambda_b_max = NULL,
c_max = NULL,
lambda_w_min_ratio = NULL,
lambda_b_min_ratio = NULL,
c_min = NULL,
icov_guess = NULL,
cov_output = FALSE,
lock_lambdas = FALSE,
method = "xestars",
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
old_sampling = FALSE,
ebic_gamma = 0.5,
verbose = TRUE
)
Arguments
data |
The input multi-omics data set. Rows should be samples, columns
should be variables. Variables should be grouped by their assay (e.g.
transcripts first, then metabolites). |
p |
A vector with with the number of variables for each omic layer of the
data set (e.g. the number of transcripts, metabolites etc.), in the same
order the layers have in the data set. If given a single number,
|
pX |
|
lambda_w |
A vector of values for the parameter |
lambda_b |
A vector of values for the parameter |
c |
A vector of values for the parameter |
nlambda_w |
The number of requested |
nlambda_b |
The number of requested |
nc |
The number of requested |
lambda_w_max |
The greatest generated |
lambda_b_max |
The greatest generated |
c_max |
The greatest |
lambda_w_min_ratio |
The ratio of the smallest generated |
lambda_b_min_ratio |
The ratio of the smallest generated |
c_min |
The the smallest |
icov_guess |
Use a predetermined inverse covariance matrix as an initial guess for the network estimation. |
cov_output |
Add the estimated variance-covariance matrix to the output. |
lock_lambdas |
Set |
method |
The model selection method to select the best combination of hyperparameters. The available options are "xstars", "xestars" and "eBIC". Defaults to "xestars". |
stars_thresh |
The threshold set for variability of the explored
networks at each iteration of the algorithm. The |
stars_subsample_ratio |
The proportion of samples in the multi-omics
data set to be randomly subsampled to estimate the variability of the
network under the given hyperparameters setting. Defaults to 80% when the
number of samples is smaller than 144, otherwise it defaults to
|
rep_num |
The amount of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20. |
max_iter |
The greatest number of times the algorithm is allowed to
choose a new best |
old_sampling |
Perform the same subsampling |
ebic_gamma |
The |
verbose |
Print information regarding the network building and the network selection processes. |
Details
When using bs(), first, coglasso() estimates multiple multi-omics networks
with the algorithm collaborative graphical lasso, one for each combination
of input values for the hyperparameters \lambda_w, \lambda_b and
c. Then, select_coglasso() selects the best combination of
hyperparameters given to coglasso() according to the selected model
selection method. The three availble options that can be set for the argument
method are "xstars", "xestars" and "ebic". For more information on these
selection methods, visit the help page of select_coglasso().
Value
bs() returns an object of S3 class select_coglasso containing
several elements. The most
important is probably sel_adj, the adjacency matrix of the
selected network. Some output elements depend on the chosen model selection
method.
These elements are always returned, and they are the result of network
estimation with coglasso():
-
loglikis a numerical vector containing theloglikelihoods of all the estimated networks. -
densityis a numerical vector containing a measure of the density of all the estimated networks. -
dfis an integer vector containing the degrees of freedom of all the estimated networks. -
convergenceis a binary vector containing whether a network was successfully estimated for the given combination of hyperparameters or not. -
pathis a list containing the adjacency matrices of all the estimated networks. -
icovis a list containing the inverse covariance matrices of all the estimated networks. -
nexplodedis the number of combinations of hyperparameters for whichcoglasso()failed to converge. -
datais the input multi-omics data set. -
hparsis the ordered table of all the combinations of hyperparameters given as input tobs(), with\alpha(\lambda_w+\lambda_b)being the key to sort rows. -
lambda_w,lambda_b, andcare numerical vectors with, respectively, all the\lambda_w,\lambda_b, andcvaluesbs()used. -
pis the vector with the number of variables for each omic layer of the data set. -
Dis the number of omics layers in the data set. -
covoptional, returned whencov_outputis TRUE, is a list containing the variance-covariance matrices of all the estimated networks.
These elements are returned by all selection methods available:
-
sel_index_c,sel_index_lwandsel_index_lbare the indexes of the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_c,sel_lambda_wandsel_lambda_bare the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_adjis the adjacency matrix of the final selected network. -
sel_densityis the density of the final selected network. -
sel_icovis the inverse covariance matrix of the final selected network. -
sel_covoptional, given only whencoglasso()was called withcov_output = TRUE. It is the covariance matrix associated with the final selected network. -
callis the matched call. -
methodis the chosen model selection method.
These are the additional elements returned when choosing "xestars" or "xstars":
-
mergeis the "merged" adjacency matrix, the average of all the adjacency matrices estimated across all the different subsamples for the selected combination of\lambda_w,\lambda_b, andcvalues in the last path explored before convergence. Each entry is a measure of how recurrent the corresponding edge is across the subsamples. -
variability_lw,variability_lbandvariability_care numeric vectors of as many items as the number of\lambda_w,\lambda_b, andcvalues explored. Each item is the variability of the network estimated for the corresponding hyperparameter value, keeping the other two hyperparameters fixed to their selected value. -
sel_variabilityis the variability of the final selected network.
These are the additional elements returned when choosing "ebic":
-
ebic_scoresis a numerical vector containing the eBIC scores for all the hyperparameter combination.
Examples
# Suggested usage: give the input data set, set the values for `p` and the
# number of hyperparameters to explore (to choose how extensively to explore
# the possible hyperparameters). Then, let the default behavior do the rest:
sel_mo_net <- bs(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE)
Estimate networks from a multi-omics data set
Description
coglasso() estimates multiple multi-omics networks with the algorithm
collaborative graphical lasso, one for each combination of input values for
the hyperparameters \lambda_w, \lambda_b and c.
Usage
coglasso(
data,
p = NULL,
pX = lifecycle::deprecated(),
lambda_w = NULL,
lambda_b = NULL,
c = NULL,
nlambda_w = NULL,
nlambda_b = NULL,
nc = NULL,
lambda_w_max = NULL,
lambda_b_max = NULL,
c_max = NULL,
lambda_w_min_ratio = NULL,
lambda_b_min_ratio = NULL,
c_min = NULL,
icov_guess = NULL,
cov_output = FALSE,
lock_lambdas = FALSE,
verbose = TRUE
)
Arguments
data |
The input multi-omics data set. Rows should be samples, columns
should be variables. Variables should be grouped by their assay (e.g.
transcripts first, then metabolites). |
p |
A vector with with the number of variables for each omic layer of the
data set (e.g. the number of transcripts, metabolites etc.), in the same
order the layers have in the data set. If given a single number,
|
pX |
|
lambda_w |
A vector of values for the parameter |
lambda_b |
A vector of values for the parameter |
c |
A vector of values for the parameter |
nlambda_w |
The number of requested |
nlambda_b |
The number of requested |
nc |
The number of requested |
lambda_w_max |
The greatest generated |
lambda_b_max |
The greatest generated |
c_max |
The greatest |
lambda_w_min_ratio |
The ratio of the smallest generated |
lambda_b_min_ratio |
The ratio of the smallest generated |
c_min |
The the smallest |
icov_guess |
Use a predetermined inverse covariance matrix as an initial guess for the network estimation. |
cov_output |
Add the estimated variance-covariance matrix to the output. |
lock_lambdas |
Set |
verbose |
Print information regarding current |
Value
coglasso() returns an object of S3 class coglasso, that has the
following elements:
-
loglikis a numerical vector containing theloglikelihoods of all the estimated networks. -
densityis a numerical vector containing a measure of the density of all the estimated networks. -
dfis an integer vector containing the degrees of freedom of all the estimated networks. -
convergenceis a binary vector containing whether a network was successfully estimated for the given combination of hyperparameters or not. -
pathis a list containing the adjacency matrices of all the estimated networks. -
icovis a list containing the inverse covariance matrices of all the estimated networks. -
nexplodedis the number of combinations of hyperparameters for whichcoglasso()failed to converge. -
datais the input multi-omics data set. -
hparsis the ordered table of all the combinations of hyperparameters given as input tocoglasso(), with\alpha(\lambda_w+\lambda_b)being the key to sort rows. -
lambda_wis a numerical vector with all the\lambda_wvaluescoglasso()used. -
lambda_bis a numerical vector with all the\lambda_bvaluescoglasso()used. -
cis a numerical vector with all thecvaluescoglasso()used. -
pis the vector with the number of variables for each omic layer of the data set. -
Dis the number of omics layers in the data set. -
icov_guessoptional, returned whenicov_guessis given. It is the predetermined inverse covariance matrix given by the user as an initial guess for the network estimation. -
covoptional, returned whencov_outputis TRUE, is a list containing the variance-covariance matrices of all the estimated networks. -
callis the matched call.
Examples
# Typical usage: set the number of hyperparameters to explore
cg <- coglasso(multi_omics_sd_micro,
p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE
)
# Model selection using eXtended Efficient StARS, takes less than five seconds
sel_cg_xestars <- select_coglasso(cg, method = "xestars", verbose = FALSE)
Extract a coglasso network
Description
get_network() extracts the selected network from a select_coglasso object,
or a different specific one from either a select_coglasso or a coglasso
object when specifying the optional parameters.
Usage
get_network(sel_cg_obj, index_c = NULL, index_lw = NULL, index_lb = NULL)
Arguments
sel_cg_obj |
The object of |
index_c |
The index of the |
index_lw |
The index of the |
index_lb |
The index of the |
Details
If the input is a coglasso object, it is necessary to specify all the
indexes to extract the chosen network.
If the input is a select_coglasso object, it extracts by default the
selected network. If the selection method was "ebic", and you want to extract
a different network than the selected one, specify all indexes.
Otherwise, if the objective is to extract the optimal network for a specific
c value different than the selected one, set index_c to your chosen
one. Also here it is possible to extract a specific non-optimal network by
setting all the indexes to the chosen ones.
Value
get_network() returns the selected network, in the form of an
object of class igraph.
Examples
sel_cg <- bs(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3, nlambda_b = 3,
nc = 3, verbose = FALSE)
sel_net <- get_network(sel_cg)
# Could even plot the selected network with plot(sel_net), but then it would
# plot an unnotated network, better to directly plot(sel_cg).
print(sel_net)
Extract a coglasso partial correlation matrix
Description
get_pcor() extracts the selected partial correlation matrix from a
select_coglasso object, or a different specific one from either a
select_coglasso or a coglasso object when specifying the optional
parameters.
Usage
get_pcor(sel_cg_obj, index_c = NULL, index_lw = NULL, index_lb = NULL)
Arguments
sel_cg_obj |
The object of |
index_c |
The index of the |
index_lw |
The index of the |
index_lb |
The index of the |
Details
If the input is a coglasso object, it is necessary to specify all the
indexes to extract the chosen partial correlation matrix.
If the input is a select_coglasso object, it extracts by default the
selected partial correlation matrix. If the selection method was "ebic", and
you want to extract a different partial correlation matrix than the selected
one, specify all indexes. Otherwise, if the objective is to extract the
optimal partial correlation matrix for a specific c value different
than the selected one, set index_c to your chosen one. Also here it is
possible to extract a specific non-optimal partial correlation matrix by
setting all the indexes to the chosen ones.
Value
get_pcor() returns the selected partial correlation matrix.
Examples
sel_cg <- bs(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3, nlambda_b = 3,
nc = 3, verbose = FALSE)
sel_pcor <- get_pcor(sel_cg)
print(sel_pcor)
Multi-omics dataset of sleep deprivation in mouse
Description
A dataset containing transcript and metabolite values analysed in Albanese et al. 2023, subset of the multi-omics data set published in Jan, M., Gobet, N., Diessler, S. et al. A multi-omics digital research object for the genetics of sleep regulation. Sci Data 6, 258 (2019).
multi_omics_sd_small is a smaller version, limited to the transcript Cirbp
and the transcripts and metabolites belonging to its neighborhood as
described in Albanese et al. 2023
multi_omics_sd_micro is a minimal version with Cirbp and a selection of its
neighborhood.
Usage
multi_omics_sd
multi_omics_sd_small
multi_omics_sd_micro
Format
multi_omics_sd
A data frame with 30 rows and 238 variables (162 transcripts and 76 metabolites):
- Plin4 to Tfrc
log2 CPM values of 162 transcripts in mouse cortex under sleep deprivation (-4.52–10.46)
- Ala to SM C24:1
abundance values of 76 metabolites (0.02–1112.67)
multi_omics_sd_small
A data frame with 30 rows and 19 variables (14 transcripts and 5 metabolites)
- Cirbp to Stip1
log2 CPM values of 14 transcripts in mouse cortex under sleep deprivation (4.24–9.31)
- Phe to PC ae C32:2
Abundance values of 5 metabolites (0.17–145.33)
multi_omics_sd_micro
A data frame with 30 rows and 6 variables (4 transcripts and 2 metabolites)
- Cirbp to Dnajb11
log2 CPM values of 4 transcripts in mouse cortex under sleep deprivation (4.78–9.31)
- Trp to PC aa C36:3
Abundance values of 2 metabolites (58.80–145.33)
Source
Jan, M., Gobet, N., Diessler, S. et al. A multi-omics digital research object for the genetics of sleep regulation. Sci Data 6, 258 (2019) doi:10.1038/s41597-019-0171-x
Figshare folder of the original manuscript: https://figshare.com/articles/dataset/Input_data_for_systems_genetics_of_sleep_regulation/7797434
Plot selected coglasso networks
Description
plot.select_coglasso() creates an annotated plot of a coglasso selected
network from an object of S3 class select_coglasso. Variables from
different data sets will have different color coding. To plot the network,
it's enough to use plot() call on the select_coglasso object.
plot.coglasso() has the same functioning as select_coglasso.plot(), but
from an object of S3 class coglasso. In this case, it is compulsory to
specify index_c, index_lw, and index_lb.
Usage
## S3 method for class 'select_coglasso'
plot(
x,
index_c = NULL,
index_lw = NULL,
index_lb = NULL,
node_labels = TRUE,
hide_isolated = TRUE,
...
)
## S3 method for class 'coglasso'
plot(
x,
index_c,
index_lw,
index_lb,
node_labels = TRUE,
hide_isolated = TRUE,
...
)
Arguments
x |
The object of |
index_c |
The index of the |
index_lw |
The index of the |
index_lb |
The index of the |
node_labels |
Show node names in the network. Defaults to TRUE. |
hide_isolated |
Hide nodes that are not connected to any other node. Defaults to TRUE. |
... |
System required, not used here. |
Details
If the input is a coglasso object, it is necessary to specify all the
indexes to extract the chosen network.
If the input is a select_coglasso object, it extracts by default the
selected network. If the selection method was "ebic", and you want to extract
a different network than the selected one, specify all indexes.
Otherwise, if the objective is to extract the optimal network for a specific
c value different than the selected one, set index_c to your chosen
one. Also here it is possible to extract a specific non-optimal network by
setting all the indexes to the chosen ones.
Value
Returns NULL, invisibly.
See Also
get_network() to understand what it means to select a specific
network with index_c, index_lw, and index_lb.
Examples
sel_cg <- bs(multi_omics_sd_small, p = c(14, 5), nlambda_w = 15, nlambda_b = 15,
nc = 3, lambda_w_min_ratio = 0.6, verbose = FALSE)
plot(sel_cg)
Select the best coglasso network
Description
select_coglasso() selects the best combination of hyperparameters given to
coglasso() according to the selected model selection method. The three
availble options that can be set for the argument method are "xstars",
"xestars" and "ebic".
Usage
select_coglasso(
coglasso_obj,
method = "xestars",
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
old_sampling = FALSE,
ebic_gamma = 0.5,
verbose = TRUE
)
Arguments
coglasso_obj |
The object of |
method |
The model selection method to select the best combination of hyperparameters. The available options are "xstars", "xestars" and "eBIC". Defaults to "xestars". |
stars_thresh |
The threshold set for variability of the explored
networks at each iteration of the algorithm. The |
stars_subsample_ratio |
The proportion of samples in the multi-omics
data set to be randomly subsampled to estimate the variability of the
network under the given hyperparameters setting. Defaults to 80% when the
number of samples is smaller than 144, otherwise it defaults to
|
rep_num |
The amount of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20. |
max_iter |
The greatest number of times the algorithm is allowed to
choose a new best |
old_sampling |
Perform the same subsampling |
ebic_gamma |
The |
verbose |
Print information regarding the progress of the selection procedure on the console. |
Details
select_coglasso() provides three model selection strategies:
"xstars" uses eXtended StARS (XStARS) selecting the most stable, yet sparse network. Stability is computed upon network estimation from multiple subsamples of the multi-omics data set, allowing repetition. Subsamples are collected for a fixed amount of times (
rep_num), and with a fixed proportion of the total number of samples (stars_subsample_ratio). Seexstars()for more information on the methodology."xestars" uses eXtended Efficient StARS (XEStARS), a significantly faster version of XStARS. It could produce marginally different results to "xstars" due to a different sampling strategy. See
xestars()for more information on the methodology."ebic" uses the extended Bayesian Information Criterion (eBIC) selecting the network that minimizes it.
gammasets the wait given to the extended component, turning the model selection method to the standard BIC if set to 0.
Value
select_coglasso() returns an object of S3 class select_coglasso
containing the results of the
selection procedure, built upon an object of S3 class coglasso. Some
output elements depend on the chosen model selection method.
These elements are returned by all methods:
... are the same elements returned by
coglasso().-
sel_index_c,sel_index_lwandsel_index_lbare the indexes of the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_c,sel_lambda_wandsel_lambda_bare the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_adjis the adjacency matrix of the final selected network. -
sel_densityis the density of the final selected network. -
sel_icovis the inverse covariance matrix of the final selected network. -
sel_covoptional, given only whencoglasso()was called withcov_output = TRUE. It is the covariance matrix associated with the final selected network. -
callis the matched call. -
methodis the chosen model selection method.
These are the additional elements returned when choosing "xestars" or "xstars":
-
mergeis the "merged" adjacency matrix, the average of all the adjacency matrices estimated across all the different subsamples for the selected combination of\lambda_w,\lambda_b, andcvalues in the last path explored before convergence. Each entry is a measure of how recurrent the corresponding edge is across the subsamples. -
variability_lw,variability_lbandvariability_care numeric vectors of as many items as the number of\lambda_w,\lambda_b, andcvalues explored. Each item is the variability of the network estimated for the corresponding hyperparameter value, keeping the other two hyperparameters fixed to their selected value. -
sel_variabilityis the variability of the final selected network.
These are the additional elements returned when choosing "ebic":
-
ebic_scoresis a numerical vector containing the eBIC scores for all the hyperparameter combination.
Examples
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE)
# Using eXtended Efficient StARS, takes less than five seconds
sel_cg_xestars <- select_coglasso(cg, method = "xestars", verbose = FALSE)
# Using eXtended StARS, takes around a minute
sel_cg_xstars <- select_coglasso(cg, method = "xstars", verbose = FALSE)
# Using eBIC
sel_cg_ebic <- select_coglasso(cg, method = "ebic", verbose = FALSE)
Stability selection of the best coglasso network
Description
stars_coglasso() was deprecated in favor of xstars() as the function name
does not reflect properly the method it implements. The new function
also allows new memory-lighter and faster possibilities.
Usage
stars_coglasso(
coglasso_obj,
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
verbose = FALSE
)
Examples
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE)
# Deprecated, use xstars() instead. Takes around one minute
sel_cg <- stars_coglasso(cg, verbose = FALSE)
# ->
sel_cg <- xstars(cg, verbose = FALSE)
Efficient stability selection of the best coglasso network
Description
xestars() provides a more efficient and lighter implementation
than xstars() to select the combination of hyperparameters given to
coglasso() yielding the most stable, yet sparse network. Stability is
computed upon network estimation from multiple subsamples of the multi-omics
data set, allowing repetition. Subsamples are collected for a fixed amount of
times (rep_num), and with a fixed proportion of the total number of samples
(stars_subsample_ratio).
Usage
xestars(
coglasso_obj,
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
old_sampling = FALSE,
verbose = TRUE
)
Arguments
coglasso_obj |
The object of |
stars_thresh |
The threshold set for variability of the explored
networks at each iteration of the algorithm. The |
stars_subsample_ratio |
The proportion of samples in the multi-omics
data set to be randomly subsampled to estimate the variability of the
network under the given hyperparameters setting. Defaults to 80% when the
number of samples is smaller than 144, otherwise it defaults to
|
rep_num |
The amount of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20. |
max_iter |
The greatest number of times the algorithm is allowed to
choose a new best |
old_sampling |
Perform the same subsampling |
verbose |
Print information regarding the progress of the selection procedure on the console. |
Details
eXtended Efficient StARS (XEStARS) is a more efficient and memory-light version of
XStARS, the adaptation for collaborative graphical regression of the method
published by Liu, H. et al. (2010): Stability Approach to Regularization
Selection (StARS). StARS was developed for network estimation regulated by
a single penalty parameter, while collaborative graphical lasso needs to
explore three different hyperparameters. These all have, to different
degree, a direct influence on network sparsity, hence on
stability. For every
iteration, xstars() explores one of the three parameters (\lambda_w,
\lambda_b, or c), keeping the other ones fixed at their previous
selected estimate, using the normal, one-dimentional StARS approach, until
finding the best combination of the three. What makes it more efficient
than xstars() is the different way that the stability check is implemented
in the two algorithms. In xstars() (and even in the original StARS), the
stability check is performed, for example, for every \lambda_w value
(or \lambda_b, or c), until all values are explored, and then it
when the algorithm selects the one yielding the most stable, yet sparse
network, and only then switching to the selection of the following
hyperparameter. In xestars(), the stability check becomes a stopping criterion.
The moment that the stability threshold is passed, the value of the
hyperparameter currently being selected is fixed, and the switch to the next
one happens immediately, without exploring the whole landscape. This reduces
sensibly the number of iterations before convergence to a final network.
The original XStARS computes a new subsampling for every time the algorithm
switches from optimizing \lambda_w, \lambda_b, or c. This
does not allow to compare the hyperparameters on an equal ground, and can
slow the selection down with bigger data set or a larger hyperparameter
space. To allow a similar subsampling to xstars(), the old_sampling
parameter has been implemented. If set to TRUE, the subsampling is similar to
the one xstars() would perform. Otherwise, the subsampling is performed at
the beginning of the algorithm once and for all its iterations.
Value
xestars() returns an object of S3 class select_coglasso
containing the results of the selection
procedure, built upon the object of S3 class coglasso returned by coglasso().
... are the same elements returned by
coglasso().-
mergeis the "merged" adjacency matrix, the average of all the adjacency matrices estimated across all the different subsamples for the selected combination of\lambda_w,\lambda_b, andcvalues in the last path explored before convergence. Each entry is a measure of how recurrent the corresponding edge is across the subsamples. -
variability_lw,variability_lbandvariability_care numeric vectors of as many items as the number of\lambda_w,\lambda_b, andcvalues explored. Each item is the variability of the network estimated for the corresponding hyperparameter value, keeping the other two hyperparameters fixed to their selected value. -
sel_index_c,sel_index_lwandsel_index_lbare the indexes of the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_c,sel_lambda_wandsel_lambda_bare the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_adjis the adjacency matrix of the final selected network. -
sel_variabilityis the variability of the final selected network. -
sel_densityis the density of the final selected network. -
sel_icovis the inverse covariance matrix of the final selected network. -
sel_covoptional, given only whencoglasso()was called withcov_output = TRUE. It is the covariance matrix associated with the final selected network. -
callis the matched call. -
methodis the chosen model selection method. Here, it is "xestars".
Examples
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE)
# Takes less than five seconds
sel_cg <- xestars(cg, verbose = FALSE)
Stability selection of the best coglasso network
Description
xstars() selects the combination of hyperparameters given to
coglasso() yielding the most stable, yet sparse network. Stability is
computed upon network estimation from multiple subsamples of the multi-omics data set,
allowing repetition. Subsamples are collected for a fixed amount of times
(rep_num), and with a fixed proportion of the total number of samples
(stars_subsample_ratio).
Usage
xstars(
coglasso_obj,
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
verbose = TRUE
)
Arguments
coglasso_obj |
The object of |
stars_thresh |
The threshold set for variability of the explored
networks at each iteration of the algorithm. The |
stars_subsample_ratio |
The proportion of samples in the multi-omics
data set to be randomly subsampled to estimate the variability of the
network under the given hyperparameters setting. Defaults to 80% when the
number of samples is smaller than 144, otherwise it defaults to
|
rep_num |
The amount of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20. |
max_iter |
The greatest number of times the algorithm is allowed to
choose a new best |
verbose |
Print information regarding the progress of the selection procedure on the console. |
Details
eXtended StARS (XStARS) is an adaptation for collaborative graphical regression of the method
published by Liu, H. et al. (2010): Stability Approach to Regularization
Selection (StARS). StARS was developed for network estimation regulated by
a single penalty parameter, while collaborative graphical lasso needs to
explore three different hyperparameters. These all have, to a different
degree, a direct influence on network sparsity, hence on stability. For every
iteration, xstars() explores one of the three parameters (\lambda_w,
\lambda_b, or c), keeping the other ones fixed at their previous
selected estimate, using the normal, one-dimentional StARS approach, until
finding the best combination of the three that yields the most stable, yet
sparse network.
Value
xstars() returns an object of S3 class select_coglasso
containing the results of the
selection procedure, built upon the object of S3 class coglasso returned by coglasso().
... are the same elements returned by
coglasso().-
mergeis the "merged" adjacency matrix, the average of all the adjacency matrices estimated across all the different subsamples for the selected combination of\lambda_w,\lambda_b, andcvalues in the last path explored before convergence. Each entry is a measure of how recurrent the corresponding edge is across the subsamples. -
variability_lw,variability_lbandvariability_care numeric vectors of as many items as the number of\lambda_w,\lambda_b, andcvalues explored. Each item is the variability of the network estimated for the corresponding hyperparameter value, keeping the other two hyperparameters fixed to their selected value. -
sel_index_c,sel_index_lwandsel_index_lbare the indexes of the final selected parametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_c,sel_lambda_wandsel_lambda_bare the final selected hyperparametersc,\lambda_wand\lambda_bleading to the most stable sparse network. -
sel_adjis the adjacency matrix of the final selected network. -
sel_variabilityis the variability of the final selected network. -
sel_densityis the density of the final selected network. -
sel_icovis the inverse covariance matrix of the final selected network. -
sel_covoptional, given only whencoglasso()was called withcov_output = TRUE. It is the covariance matrix associated with the final selected network. -
callis the matched call. -
methodis the chosen model selection method. Here, it is "xstars".
Examples
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
nlambda_b = 3, nc = 3, verbose = FALSE)
# Takes around one minute
sel_cg <- xstars(cg, verbose = FALSE)