| Type: | Package |
| Title: | General Unilateral Load Estimator for Two-Layer Latent Factor Models |
| Version: | 0.1.2 |
| Description: | Implements general unilateral loading estimator for two-layer latent factor models with smooth, element-wise factor transformations. We provide data simulation, loading estimation,finite-sample error bounds, and diagnostic tools for zero-mean and sub-Gaussian assumptions. A unified interface is given for evaluating estimation accuracy and cosine similarity. The philosophy of the package is described in Guo G. (2026) <doi:10.1016/j.apm.2025.116280>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Depends: | R (≥ 3.5.0) |
| Imports: | MASS, matrixStats |
| Suggests: | testthat (≥ 3.0.0), ggplot2 |
| NeedsCompilation: | no |
| Language: | en-US |
| Author: | Guangbao Guo [aut, cre] |
| Maintainer: | Guangbao Guo <ggb11111111@163.com> |
| Packaged: | 2025-10-22 23:54:26 UTC; Administrator |
| Repository: | CRAN |
| Date/Publication: | 2025-10-28 08:10:02 UTC |
General unilateral load Estimator
Description
General unilateral load Estimator
Usage
estimate_gul_loadings(X, m)
Arguments
X |
n *p data matrix (already centred and scaled if desired). |
m |
number of latent factors (both layers). |
Details
Step 1: PCA on X to get hat_A1 Step 2: Regress X on hat_A1 to get hat_gF1 Step 3: PCA on hat_gF1 to get hat_A2 Step 4: hat_Ag = hat_A1
Value
A list with hat_A1 : p * m 1st-layer loadings hat_A2 : m * m 2nd-layer loadings hat_Ag : p * m overall loadings Sigma1 : p * p sample cov(X) (for diagnostics) Sigma2 : m * m sample cov(hat_gF1) hat_gF1 : n * m estimated transformed latent factors eig1 : eigen-values of Sigma1 eig2 : eigen-values of Sigma2
Examples
dat <- generate_gfm_data(500, 50, 5, tanh, seed = 1)
est <- estimate_gul_loadings(dat$X, m = 5)
err <- sqrt(mean((est$hat_Ag - dat$Ag)^2)) # overall RMSE
Smooth link functions compliant with Theorems 9&10
Description
Returns a vectorised map g(\cdot) and its exact Lipschitz constant
L_g for three increasingly nonlinear choices.
Usage
g_fun(type = c("linear", "weak_nonlinear", "strong_nonlinear"))
Arguments
type |
Character string selecting the map:
|
Value
Named list with components
g_fun |
vectorised function |
L_g |
scalar Lipschitz constant of |
Examples
## pick a link with L_g = 1
tmp <- g_fun("linear")
dat <- generate_gfm_data(n = 500, p = 200, m = 5, g_fun = tmp$g_fun)
est <- estimate_gul_loadings(dat$X, m = 5)
err <- norm(est$hat_Ag - dat$Ag, "F")
sprintf("F-error (L_g = %d) = %.3f", tmp$L_g, err)
Simulation wrapper for Theorems 9 & 10
Description
One Monte-Carlo replicate; returns empirical error, exceedance indicator, theoretical bounds, and assumption-check flags.
Usage
g_theorem(n, p, m, g_type, epsilon, zero_tol = 0.02)
Arguments
n |
sample size |
p |
number of observed variables |
m |
number of latent factors |
g_type |
character: "linear", "weak_nonlinear", "strong_nonlinear" |
epsilon |
error threshold |
zero_tol |
zero-mean tolerance (default 0.02) |
Value
one-row data-frame
Examples
df <- g_theorem(500, 200, 5, "linear", 0.6)
Generate general factor model with smooth latent transformation
Description
Generate general factor model with smooth latent transformation
Usage
generate_gfm_data(n, p, m, g_fun, seed = 1, sigma_V = 0.1)
Arguments
n |
Integer: sample size. |
p |
Integer: number of observed variables. |
m |
Integer: number of latent factors (both layers). |
g_fun |
Function: smooth, element-wise transformation applied to latent factors. Must be vectorised, e.g. 'sin', 'tanh', 'scale'. |
seed |
1. |
sigma_V |
Numeric: standard deviation of the idiosyncratic noise (default 0.1 => Var = 0.01). |
Value
List with components X : n * p matrix of standardised observations. A1 : p * m first-layer loading matrix. A2 : m * m second-layer loading matrix. Ag : p * m overall loading matrix (Ag = A1 F1 : n * m latent factors (before transformation). gF1: n * m latent factors (after transformation). V1 : n * p noise matrix (for diagnostics).
Examples
dat <- generate_gfm_data(200, 50, 5, g_fun = tanh)
Single-replication GUL simulation
Description
Generates one synthetic data set, estimates loadings with the GUL, and evaluates estimation accuracy.
Usage
gul_simulation(n, p, m, g_fun)
Arguments
n |
Integer: sample size. |
p |
Integer: number of observed variables. |
m |
Integer: number of latent factors (both layers). |
g_fun |
Function: element-wise, smooth transformation applied to the latent factors (e.g. 'tanh', 'sin'). |
Value
Named numeric vector with components error_F : Frobenius norm ||hat(Ag) - Ag||_F
Examples
gul_simulation(200, 50, 5, g_fun = tanh)
Multi-metric evaluation of factor loading matrix estimation error
Description
Multi-metric evaluation of factor loading matrix estimation error
Usage
loading_metrics(A_true, A_hat)
Arguments
A_true |
True loading matrix (p x m) |
A_hat |
Estimated loading matrix (p x m) |
Value
data.frame with MSE, RMSE, MAE, MaxDev, and Cosine similarity
Examples
## simulated example
p <- 100; m <- 5
Ag_true <- matrix(rnorm(p*m), p, m)
Ag_hat <- Ag_true + matrix(rnorm(p*m, 0, 0.1), p, m)
metrics <- loading_metrics(Ag_true, Ag_hat)
print(metrics)
Verify zero-mean preservation (Theorem 10 assumption 2a)
Description
Draws n i.i.d. N(0, I_m) latent factors, applies g component-wise, and checks whether |E[g(x)]| < tol on every coordinate.
Usage
verify_mean(g_fun, m = 5, n = 10000, tol = 0.001)
Arguments
g_fun |
vectorised map g: R -> R |
m |
latent dimension |
n |
Monte-Carlo sample size |
tol |
numerical tolerance (default 1e-3) |
Value
logical TRUE if |mean| < tol on all coords
Examples
tmp <- g_fun("weak_nonlinear")
verify_mean(tmp$g_fun, m = 5)
Verify sub-Gaussian preservation
Description
Draws n i.i.d. N(0, I_m) latent factors, applies g component-wise, and checks whether E[exp(g(x))] remains below an empirical cut-off. This is a quick proxy for finite sub-Gaussian norm.
Usage
verify_subgaussian(g_fun, m = 5, n = 1000, cut = exp(2))
Arguments
g_fun |
vectorised map g: R -> R |
m |
latent dimension |
n |
Monte-Carlo sample size |
cut |
empirical threshold (default exp(2) & 7.389) |
Value
logical TRUE if E[exp(g)] < cut on all coords
Examples
tmp <- g_fun("strong_nonlinear")
verify_subgaussian(tmp$g_fun, m = 5)