Type: | Package |
Title: | Lorenz and Penalized Lorenz Regressions |
Version: | 2.2.0 |
Description: | Inference for the Lorenz and penalized Lorenz regressions. More broadly, the package proposes functions to assess inequality and graphically represent it. The Lorenz Regression procedure is introduced in Heuchenne and Jacquemain (2022) <doi:10.1016/j.csda.2021.107347> and in Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024) <doi:10.1214/23-EJS2200>. |
License: | GPL-3 |
Encoding: | UTF-8 |
Depends: | R (≥ 3.3.1) |
LazyData: | true |
Imports: | stats, ggplot2, parsnip, boot, rsample, parallel, doParallel, foreach, MASS, GA, Rearrangement, progress, Rcpp (≥ 0.11.0) |
RoxygenNote: | 7.3.2 |
Suggests: | rmarkdown |
LinkingTo: | Rcpp, RcppArmadillo |
URL: | https://github.com/AlJacq/LorenzRegression |
BugReports: | https://github.com/AlJacq/LorenzRegression/issues |
ByteCompile: | true |
NeedsCompilation: | yes |
Packaged: | 2025-06-27 06:52:39 UTC; Jacquemain |
Author: | Alexandre Jacquemain |
Maintainer: | Alexandre Jacquemain <aljacquemain@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-06-27 07:30:12 UTC |
LorenzRegression : A package to estimate and interpret Lorenz regressions
Description
The LorenzRegression package proposes a toolbox to estimate, produce inference on and interpret Lorenz regressions.
As argued in Heuchenne and Jacquemain (2022) and Jacquemain, Heuchenne, and Pircalabelu (2024), these regressions are used to determine the explanatory power of a set of covariates on the inequality of a response variable.
In a nutshell, each variable is given a weight in order to maximize the concentration index of the response with respect to a weighted sum of the covariates.
The obtained concentration index is called the explained Gini coefficient. If a single-index model with increasing link function is assumed, the explained Gini boils down to the Gini coefficient of the fitted part of the model.
This package rests on two main functions: Lorenz.Reg for the estimation process and Lorenz.boot for more complete inference (tests and confidence intervals).
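To make the idea of the explained Gini coefficient concrete, the following base-R sketch computes the concentration index of a response with respect to a weighted sum of covariates, using the covariance formula C = 2 cov(y, F) / mean(y), where F is the fractional rank of the index. It is only an illustration under stated assumptions: the helper conc_index and the simulated data are hypothetical and are not part of the package, whose own estimator is Lorenz.Reg.

```r
# Hypothetical helper (not a package function): concentration index of y
# with respect to x, with equal weights and average ranks for ties.
conc_index <- function(y, x = y) {
  n <- length(y)
  fr <- (rank(x, ties.method = "average") - 0.5) / n  # fractional ranks
  2 * sum((y - mean(y)) * (fr - mean(fr))) / (n * mean(y))
}

set.seed(1)
n <- 500
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
theta_true <- c(0.8, 0.6)                 # weights with unit L2-norm
y <- exp(drop(X %*% theta_true))          # noise-free model, increasing link

theta_alt <- c(1, -1) / sqrt(2)           # another candidate with unit norm

c_true <- conc_index(y, drop(X %*% theta_true))
c_alt  <- conc_index(y, drop(X %*% theta_alt))
gini_y <- conc_index(y)                   # x = y gives the Gini coefficient

c(true = c_true, alt = c_alt, gini = gini_y)
```

In this noise-free setting the index built from theta_true ranks the observations exactly as y does, so its concentration index coincides with the Gini coefficient of y, while any other weighting can only do worse.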
Details
We direct the user to Heuchenne and Jacquemain (2022) and Jacquemain, Heuchenne, and Pircalabelu (2024) for a rigorous exposition of the methodology and to the vignette Learning Lorenz regressions with examples for a motivational introduction to the LorenzRegression package.
References
Heuchenne, C. and A. Jacquemain (2022). Inference for monotone single-index conditional means: A Lorenz regression approach. Computational Statistics & Data Analysis 167(C).
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
Author(s)
Maintainer: Alexandre Jacquemain aljacquemain@gmail.com (ORCID)
Other contributors:
Xingjie Shi xingjieshi@njue.edu.cn (Author of an R implementation of the FABS algorithm available at https://github.com/shuanggema/Fabs, from which function Lorenz.FABS is derived) [contributor]
See Also
Useful links:
Report bugs at https://github.com/AlJacq/LorenzRegression/issues
Computes the fitness used in the GA
Description
Computes the fitness of a candidate in the genetic algorithm displayed in function Lorenz.GA.cpp
Usage
.Fitness_cpp(x, Y, X, Z, pi, tolerance)
Arguments
x |
vector of size (p-1) giving the proposed candidate, where p is the number of covariates |
Y |
vector of size n gathering the response, where n is the sample size |
X |
matrix of dimension (n*p) gathering the covariates |
Z |
vector of size n gathering iid repetitions of a U[0,1] |
pi |
vector of size n gathering the observation weights (notice that sum(pi)=1) |
tolerance |
A small positive number used to determine the threshold for considering two floating-point numbers as equal. This is primarily used to address issues with floating-point precision when comparing values that should theoretically be identical but may differ slightly due to numerical inaccuracies. |
Value
Fitness of candidate x
Computes fractional ranks
Description
Computes the vector of fractional ranks related to a given vector
Usage
.frac_rank_cpp(x, pi)
Arguments
x |
vector of size n gathering the values for which the fractional rank should be computed |
pi |
vector of size n gathering the observation weights (notice that sum(pi)=1) |
Value
Fractional rank related to vector x
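As an illustration, one common definition of the weighted fractional rank of an observation is the total weight of the observations with a strictly smaller value plus half the weight of the observations tied with it. The base-R sketch below implements that definition. The convention is an assumption, and frac_rank_sketch is a hypothetical name, not the package's internal .frac_rank_cpp.

```r
# Hypothetical base-R sketch of weighted fractional ranks (average-rank
# convention for ties); pi is assumed to be normalized so that sum(pi) = 1.
frac_rank_sketch <- function(x, pi = rep(1 / length(x), length(x))) {
  stopifnot(length(x) == length(pi), abs(sum(pi) - 1) < 1e-8)
  vapply(seq_along(x), function(i) {
    sum(pi[x < x[i]]) + 0.5 * sum(pi[x == x[i]])
  }, numeric(1))
}

x <- c(10, 20, 20, 40)
frac_rank_sketch(x)   # equal weights: 0.125 0.500 0.500 0.875
```

With equal weights this reduces to (rank(x) - 0.5) / n, where rank uses the average method for ties.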
Simulated income data
Description
Fictitious cross-sectional dataset used to illustrate the Lorenz regression methodology. It covers 7 variables for 200 individuals aged between 25 and 30 years.
Usage
data(Data.Incomes)
Format
A data frame with 200 rows and 7 columns:
- Income
Individual's labor income
- Sex
Sex (0=Female, 1=Male)
- Health.level
Variable ranging from 0 to 10 indicating the individual's health level (0 is worst, 10 is best)
- Age
Individual's age in years, ranging from 25 to 30
- Work.Hours
Individual's weekly work hours
- Education
Individual's highest grade completed in years
- Seniority
Length of service in years with the individual's employer
Concentration index of y with respect to x
Description
Gini.coef computes the concentration index of a vector y with respect to another vector x.
If y and x are identical, the obtained concentration index boils down to the Gini coefficient.
Usage
Gini.coef(
y,
x = y,
na.rm = TRUE,
ties.method = c("mean", "random"),
seed = NULL,
weights = NULL
)
Arguments
y |
variable of interest. |
x |
variable to use for the ranking. By default, x = y, in which case the Gini coefficient is computed. |
na.rm |
should missing values be deleted. Default value is TRUE. |
ties.method |
What method should be used to break the ties in the rank index. Possible values are "mean" (default value) or "random". If "random" is selected, the ties are broken by further ranking in terms of a uniformly distributed random variable. If "mean" is selected, the average rank method is used. |
seed |
fixes what seed is imposed for the generation of the vector of uniform random variables used to break the ties. Default is NULL, in which case no seed is imposed. |
weights |
vector of sample weights. By default, each observation is given the same weight. |
Details
The parameter seed allows for local seed setting to control randomness in the generation of the uniform random variables.
The specified seed is applied to the respective part of the computation, and the seed is reverted to its previous state after the operation.
This ensures that the seed settings do not interfere with the global random state or other parts of the code.
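The local seeding behaviour described above can be sketched in base R as follows. This is a minimal illustration of the general save-and-restore pattern, not the package's actual code, and with_local_seed is a hypothetical helper name.

```r
# Hypothetical helper illustrating "local" seeding: evaluate expr under a
# given seed, then restore the global RNG state to what it was before.
with_local_seed <- function(seed, expr) {
  if (is.null(seed)) return(expr)           # no seed imposed
  had_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_state <- if (had_state) get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_state) {
      assign(".Random.seed", old_state, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  expr                                      # evaluated lazily, after set.seed
}

set.seed(2024)
before <- runif(1)

set.seed(2024)
ties_a <- with_local_seed(1, runif(3))      # reproducible tie-breakers
ties_b <- with_local_seed(1, runif(3))
after <- runif(1)                           # unaffected by the local seed
```

Here before and after are identical, which shows that the two locally seeded draws did not advance or alter the global random stream.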
Value
The value of the concentration index (or Gini coefficient)
Examples
data(Data.Incomes)
# We first compute the Gini coefficient of Income
Y <- Data.Incomes$Income
Gini.coef(y = Y)
# Then we compute the concentration index of Income with respect to Age
X <- Data.Incomes$Age
Gini.coef(y = Y, x = X)
Estimates the parameter vector in a penalized Lorenz regression with lasso penalty
Description
Lorenz.FABS solves the penalized Lorenz regression with (adaptive) Lasso penalty on a grid of lambda values.
For each value of lambda, the function returns estimates for the vector of parameters and for the estimated explained Gini coefficient, as well as the Lorenz-R^2 of the regression.
Usage
Lorenz.FABS(
y,
x,
standardize = TRUE,
weights = NULL,
kernel = 1,
h = length(y)^(-1/5.5),
gamma = 0.05,
lambda = "Shi",
w.adaptive = NULL,
eps = 0.005,
iter = 10^4,
lambda.min = 1e-07
)
Arguments
y |
a vector of responses |
x |
a matrix of explanatory variables |
standardize |
Should the variables be standardized before the estimation process? Default value is TRUE. |
weights |
vector of sample weights. By default, each observation is given the same weight. |
kernel |
integer indicating what kernel function to use. The value 1 is the default and implies the use of an Epanechnikov kernel while the value of 2 implies the use of a biweight kernel. |
h |
bandwidth of the kernel, determining the smoothness of the approximation of the indicator function. Default value is n^(-1/5.5) where n is the sample size. |
gamma |
value of the Lagrange multiplier in the loss function |
lambda |
this parameter relates to the regularization parameter. Several options are available.
|
w.adaptive |
vector of size equal to the number of covariates where each entry indicates the weight in the adaptive Lasso. By default, each covariate is given the same weight (Lasso). |
eps |
step size in the FABS algorithm. Default value is 0.005. |
iter |
maximum number of iterations. Default value is 10^4. |
lambda.min |
lower bound of the penalty parameter. Only used if lambda="Shi". |
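The kernel and h arguments control how the indicator function is smoothed. As a rough illustration, assuming the smoothed indicator is the integrated kernel evaluated at u/h (an assumption about the form of the approximation, not a description of the package internals), the Epanechnikov case can be sketched in base R. The name smooth_indicator is hypothetical.

```r
# Hypothetical sketch: smooth approximation of the indicator 1{u > 0}
# obtained by integrating the Epanechnikov kernel K(v) = 0.75 * (1 - v^2)
# on [-1, 1] and rescaling by the bandwidth h.
smooth_indicator <- function(u, h) {
  v <- pmin(pmax(u / h, -1), 1)      # clamp to the kernel support
  0.5 + 0.75 * v - 0.25 * v^3        # integral of K from -1 to v
}

n <- 200
h <- n^(-1 / 5.5)                    # default bandwidth quoted above
u <- c(-1, -h / 2, 0, h / 2, 1)
smooth_indicator(u, h)               # 0 0.15625 0.5 0.84375 1
```

A smaller h makes the approximation closer to the step function, at the price of a less smooth objective.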
Details
The regression is solved using the FABS algorithm developed by Shi et al. (2018) and adapted to our case. For a comprehensive explanation of the Penalized Lorenz Regression, see Jacquemain et al. (2024). In order to ensure identifiability, theta is forced to have an L2-norm equal to one.
Value
A list with several components:
lambda
vector gathering the different values of the regularization parameter.
theta
matrix where column i provides the vector of estimated coefficients corresponding to the value lambda[i] of the regularization parameter.
LR2
vector where element i provides the Lorenz-R^2 attached to the value lambda[i] of the regularization parameter.
Gi.expl
vector where element i provides the estimated explained Gini coefficient related to the value lambda[i] of the regularization parameter.
References
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
Shi, X., Y. Huang, J. Huang, and S. Ma (2018). A Forward and Backward Stagewise Algorithm for Nonconvex Loss Function with Adaptive Lasso, Computational Statistics & Data Analysis 124, 235-251.
Examples
data(Data.Incomes)
y <- Data.Incomes[,1]
x <- as.matrix(Data.Incomes[,-c(1,2)])
Lorenz.FABS(y, x)
Estimates the parameter vector in Lorenz regression using a genetic algorithm
Description
Lorenz.GA estimates the coefficient vector of the single-index model.
It also returns the Lorenz-R^2 of the regression as well as the estimated explained Gini coefficient.
Usage
Lorenz.GA(
y,
x,
standardize = TRUE,
weights = NULL,
popSize = 50,
maxiter = 1500,
run = 150,
suggestions = NULL,
ties.method = c("random", "mean"),
ties.Gini = c("random", "mean"),
seed.random = NULL,
seed.Gini = NULL,
seed.GA = NULL,
parallel.GA = FALSE
)
Arguments
y |
a vector of responses |
x |
a matrix of explanatory variables |
standardize |
Should the variables be standardized before the estimation process? Default value is TRUE. |
weights |
vector of sample weights. By default, each observation is given the same weight. |
popSize |
Size of the population of candidates in the genetic algorithm. Default value is 50. |
maxiter |
Maximum number of iterations in the genetic algorithm. Default value is 1500. |
run |
Number of iterations without improvement in the best fitness necessary for the algorithm to stop. Default value is 150. |
suggestions |
Initial guesses used in the genetic algorithm. The default value is NULL. |
ties.method |
What method should be used to break the ties in the optimization program. Possible values are "random" (default value) or "mean". If "random" is selected, the ties are broken by further ranking in terms of a uniformly distributed random variable. If "mean" is selected, the average rank method is used. |
ties.Gini |
What method should be used to break the ties in the computation of the Gini coefficient at the end of the algorithm. Possible values and default choice are the same as above. |
seed.random |
An optional seed for generating the vector of uniform random variables used to break ties in the genetic algorithm. Defaults to NULL. |
seed.Gini |
An optional seed for generating the vector of uniform random variables used to break ties in the computation of the Gini coefficient. Defaults to NULL. |
seed.GA |
An optional seed for |
parallel.GA |
Whether parallel computing should be used to distribute the computations in the genetic algorithm. Either a logical value determining whether parallel computing is used (TRUE) or not (FALSE, the default value), or a numerical value determining the number of cores to use. |
Details
The genetic algorithm is solved using function ga from the GA package. The fitness function is coded in Rcpp to speed up computation time.
When discrete covariates are introduced and ties occur in the index, the default option randomly breaks them, as advised in Section 3 of Heuchenne and Jacquemain (2022).
The parameters seed.random, seed.Gini, and seed.GA allow for local seed setting to control randomness in specific parts of the function.
Each seed is applied to the respective part of the computation, and the seed is reverted to its previous state after the operation.
This ensures that the seed settings do not interfere with the global random state or other parts of the code.
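The two tie-breaking options can be illustrated with a short base-R sketch. It only shows the ranking step, under the assumption that random tie-breaking orders tied values by an auxiliary vector of uniform draws, and it is not the package's internal code.

```r
# Hypothetical sketch of random tie-breaking: ties in the index are ordered
# according to an auxiliary vector V of uniform draws.
set.seed(10)
index <- c(2, 1, 2, 3, 1)            # ties arise with discrete covariates
V <- runif(length(index))            # iid U[0,1] tie-breakers

ord <- order(index, V)               # sort by index, then by V within ties
rank_random <- integer(length(index))
rank_random[ord] <- seq_along(index) # ranks 1..n, no ties left

# Average-rank alternative ("mean"): tied values share the same rank
rank_mean <- rank(index, ties.method = "average")  # 3.5 1.5 3.5 5.0 1.5
```

Random tie-breaking yields a strict ranking that changes with the draws of V, which is why a dedicated seed argument is useful for reproducibility.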
Value
A list with several components:
theta
the estimated vector of parameters.
LR2
the Lorenz-R^2 of the regression.
Gi.expl
the estimated explained Gini coefficient.
niter
number of iterations attained by the genetic algorithm.
fit
value attained by the fitness function at the optimum.
References
Heuchenne, C. and A. Jacquemain (2022). Inference for monotone single-index conditional means: A Lorenz regression approach. Computational Statistics & Data Analysis 167(C).
Examples
data(Data.Incomes)
y <- Data.Incomes$Income
x <- cbind(Data.Incomes$Age, Data.Incomes$Work.Hours)
Lorenz.GA(y, x, popSize = 40)
Defines the population used in the genetic algorithm
Description
Lorenz.Population creates the initial population of the genetic algorithm used to solve the Lorenz regression.
Usage
Lorenz.Population(object)
Arguments
object |
An object of class " |
Details
Note that this population produces an initial solution ensuring a unit norm.
Value
A matrix of dimension object@popSize times the number of explanatory variables minus one, gathering the initial population.
Fits a Lorenz regression
Description
Lorenz.Reg fits the Lorenz regression of a response with respect to several covariates.
Usage
Lorenz.Reg(
formula,
data,
weights,
na.action,
penalty = c("none", "SCAD", "LASSO"),
grid.arg = c("h", "SCAD.nfwd", "eps", "kernel", "a", "gamma"),
grid.value = NULL,
...
)
Arguments
formula |
An object of class " |
data |
An optional data frame, list or environment (or object coercible by |
weights |
An optional vector of sample weights to be used in the fitting process. Should be |
na.action |
A function which indicates what should happen when the data contain |
penalty |
A character string specifying the type of penalty on the size of the estimated coefficients of the single-index model.
The default value is "none". |
grid.arg |
A character string specifying the tuning parameter for which a grid is to be constructed, see Details. |
grid.value |
A numeric vector specifying the grid values, see Details. |
... |
Additional parameters corresponding to arguments passed in |
Details
In the penalized case, the model is fitted for a grid of values of two parameters: the penalty parameter (lambda) and one tuning parameter specified by the arguments grid.arg and grid.value.
The possible values for grid.arg are tuning parameters of the functions Lorenz.FABS and Lorenz.SCADFABS: "h" (the default), "SCAD.nfwd", "eps", "kernel", "a" and "gamma".
The values for the grid are specified with grid.value. The default is NULL, in which case no grid is constructed.
Value
An object of class "LR" for the non-penalized Lorenz regression or of class "PLR" for a penalized Lorenz regression.
Several methods are available for both classes to facilitate model analysis.
Use summary.LR or summary.PLR to summarize the model fits.
Extract the coefficients of the single-index model using coef.LR or coef.PLR.
Measures of explained inequality (Gini coefficient and Lorenz-R^2) are retrieved using ineqExplained.LR or ineqExplained.PLR.
Obtain predictions with predict.LR or predict.PLR, and fitted values with fitted.LR or fitted.PLR.
For visual representations of explained inequality, use autoplot.LR and plot.LR, or autoplot.PLR and plot.PLR.
The object of class "LR" is a list containing the following components:
theta
The estimated vector of parameters.
Gi.expl
The estimated explained Gini coefficient.
LR2
The Lorenz-R^2 of the regression.
The object of class "PLR" is a list containing the following components:
path
A list where the different elements correspond to the values of the grid parameter. Each element is a matrix where the first line displays the vector of lambda values. The second and third lines display the evolution of the Lorenz-R^2 and explained Gini coefficient along that vector. The next lines display the evolution of the BIC score. The remaining lines display the evolution of the estimated coefficients of the single-index model.
lambda.idx
the index of the optimal lambda obtained by the BIC method.
grid.idx
the index of the optimal grid parameter obtained by the BIC method.
In both cases, the list also provides technical information, such as the specified formula, weights and call, as well as the design matrix x and the response vector y.
References
Heuchenne, C. and A. Jacquemain (2022). Inference for monotone single-index conditional means: A Lorenz regression approach. Computational Statistics & Data Analysis 167(C).
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
See Also
Lorenz.GA, Lorenz.SCADFABS, Lorenz.FABS, Lorenz.boot
Examples
data(Data.Incomes)
set.seed(123)
data <- Data.Incomes[sample(1:200,40),]
# 1. Non-penalized regression
NPLR <- Lorenz.Reg(Income ~ ., data = Data.Incomes, penalty = "none", popSize = 15)
# 2. Penalized regression
PLR <- Lorenz.Reg(Income ~ ., data = Data.Incomes, penalty = "SCAD",
eps = 0.06, grid.arg = "h",
grid.value=c(0.5,1,2)*nrow(Data.Incomes)^(-1/5.5))
# Print method
print(NPLR)
print(PLR)
# Summary method
summary(NPLR)
summary(PLR)
# Coef method
coef(NPLR)
coef(PLR)
# ineqExplained method
ineqExplained(NPLR)
ineqExplained(PLR)
# Predict method
## One can predict either the index or the response
predict(NPLR,type="response")
predict(PLR,type="response")
# Plot method
## The default displays the explained and observed Lorenz curve.
plot(NPLR)
plot(PLR)
## It is also possible to display a residuals plot.
plot(PLR,type="residuals")
## For PLR only, one can obtain a traceplot of the penalized coefficients
plot(PLR,type="traceplot")
Estimates the parameter vector in a penalized Lorenz regression with SCAD penalty
Description
Lorenz.SCADFABS solves the penalized Lorenz regression with SCAD penalty on a grid of lambda values.
For each value of lambda, the function returns estimates for the vector of parameters and for the estimated explained Gini coefficient, as well as the Lorenz-R^2 of the regression.
Usage
Lorenz.SCADFABS(
y,
x,
standardize = TRUE,
weights = NULL,
kernel = 1,
h = length(y)^(-1/5.5),
gamma = 0.05,
a = 3.7,
lambda = "Shi",
eps = 0.005,
SCAD.nfwd = NULL,
iter = 10^4,
lambda.min = 1e-07
)
Arguments
y |
a vector of responses |
x |
a matrix of explanatory variables |
standardize |
Should the variables be standardized before the estimation process? Default value is TRUE. |
weights |
vector of sample weights. By default, each observation is given the same weight. |
kernel |
integer indicating what kernel function to use. The value 1 is the default and implies the use of an Epanechnikov kernel while the value of 2 implies the use of a biweight kernel. |
h |
bandwidth of the kernel, determining the smoothness of the approximation of the indicator function. Default value is n^(-1/5.5) where n is the sample size. |
gamma |
value of the Lagrange multiplier in the loss function |
a |
parameter of the SCAD penalty. Default value is 3.7. |
lambda |
this parameter relates to the regularization parameter. Several options are available.
|
eps |
step size in the FABS algorithm. Default value is 0.005. |
SCAD.nfwd |
optional tuning parameter used if penalty="SCAD". Default value is NULL. The larger the value of this parameter, the sooner the path produced by the SCAD will differ from the path produced by the LASSO. |
iter |
maximum number of iterations. Default value is 10^4. |
lambda.min |
lower bound of the penalty parameter. Only used if lambda="Shi". |
Details
The regression is solved using the SCAD-FABS algorithm developed by Jacquemain et al. (2024). For a comprehensive explanation of the Penalized Lorenz Regression, see the same reference. In order to ensure identifiability, theta is forced to have an L2-norm equal to one.
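The need for a unit-norm constraint can be seen from a small base-R sketch: rescaling theta by a positive constant leaves the ordering of the index unchanged, so a rank-based criterion cannot distinguish between the two vectors. The data and the vector theta_raw below are hypothetical and serve only as an illustration.

```r
# Sketch: rescaling theta by a positive constant does not change the ordering
# of the index x %*% theta, so the scale of theta is fixed by normalization.
set.seed(3)
x <- matrix(rnorm(40), nrow = 20, ncol = 2)
theta_raw <- c(3, -4)

theta <- theta_raw / sqrt(sum(theta_raw^2))   # unit L2-norm: c(0.6, -0.8)

same_order <- identical(order(drop(x %*% theta_raw)),
                        order(drop(x %*% theta)))
same_order                                    # TRUE
```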
Value
A list with several components:
lambda
vector gathering the different values of the regularization parameter.
theta
matrix where column i provides the vector of estimated coefficients corresponding to the value lambda[i] of the regularization parameter.
LR2
vector where element i provides the Lorenz-R^2 attached to the value lambda[i] of the regularization parameter.
Gi.expl
vector where element i provides the estimated explained Gini coefficient related to the value lambda[i] of the regularization parameter.
References
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
Examples
data(Data.Incomes)
y <- Data.Incomes[,1]
x <- as.matrix(Data.Incomes[,-c(1,2)])
Lorenz.SCADFABS(y, x)
Defines the suggestions used in the genetic algorithm
Description
Lorenz.Suggestions creates suggestions for the genetic algorithm used to solve the Lorenz regression.
Usage
Lorenz.Suggestions(suggestions, popSize, y, x, pi, x.scale, seed)
Arguments
suggestions |
either a character string 'OLS' or a numeric matrix with at most |
popSize |
population size of the genetic algorithm |
y |
vector of responses |
x |
matrix of covariates (after standardization if applied) |
pi |
vector of normalized weights |
x.scale |
vector of standard deviations of the covariates |
seed |
seed used in the generation of the suggestions |
Value
A matrix with at most popSize rows and with a number of columns equal to the number of explanatory variables minus one.
Bootstrap for the (penalized) Lorenz regression
Description
Lorenz.boot performs bootstrap estimation for the vector of coefficients of the single-index model, the explained Gini coefficient, and the Lorenz-R^2. In the penalized case, it also provides a selection method.
Usage
Lorenz.boot(
object,
R,
boot_out_only = FALSE,
store_LC = FALSE,
show_progress = TRUE,
...
)
Arguments
object |
An object of class |
R |
An integer specifying the number of bootstrap replicates. |
boot_out_only |
A logical value indicating whether the function should return only the raw bootstrap output. This advanced feature can help save computation time in specific use cases. See Details. |
store_LC |
A logical determining whether the ordinates of the explained Lorenz curves should be stored for each bootstrap sample. The default is FALSE. |
show_progress |
A logical. If |
... |
Additional arguments passed to either the bootstrap function |
Details
The function supports parallel computing in two ways:
- Using the built-in parallelization options of boot, which can be controlled via the ... arguments such as parallel, ncpus, and cl.
- Running multiple independent instances of Lorenz.boot(), each handling a subset of the bootstrap samples. In this case, setting boot_out_only = TRUE ensures that the function only returns the raw bootstrap results. These results can be merged using Lorenz.boot.combine.
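The second strategy amounts to running independent batches of bootstrap replicates and pooling them afterwards. The base-R sketch below illustrates that idea on a simple statistic (the Gini coefficient of a simulated sample). It does not call Lorenz.boot or Lorenz.boot.combine, and the helper names gini and boot_batch are hypothetical.

```r
# Hypothetical helpers, for illustration only.
gini <- function(y) {
  n <- length(y)
  fr <- (rank(y, ties.method = "average") - 0.5) / n
  2 * sum((y - mean(y)) * (fr - mean(fr))) / (n * mean(y))
}

# One batch of R bootstrap replicates, run under its own seed
boot_batch <- function(y, R, seed) {
  set.seed(seed)
  replicate(R, gini(sample(y, replace = TRUE)))
}

set.seed(123)
y <- rlnorm(100)

# Two independent batches (these could run on different workers) ...
batches <- list(boot_batch(y, R = 10, seed = 123),
                boot_batch(y, R = 10, seed = 456))
# ... pooled into a single set of 20 replicates
pooled <- unlist(batches)
quantile(pooled, c(0.025, 0.975))    # percentile interval
```

Giving each batch a distinct seed matters: with identical seeds the batches would reproduce the same resamples and pooling would add no information.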
Handling of additional arguments (...):
The function allows for two types of arguments through ...:
- Arguments for boot, used to control the bootstrap procedure.
- Arguments for the underlying fit functions (Lorenz.GA, Lorenz.FABS, or Lorenz.SCADFABS). By default, the function retrieves these parameters from the original Lorenz.Reg call. However, users can override them by explicitly passing new values through the ... arguments.
Value
An object of class c("LR_boot", "LR") or c("PLR_boot", "PLR"), depending on whether a non-penalized or penalized regression was fitted.
The methods confint.LR and confint.PLR can be used on objects of class "LR_boot" or "PLR_boot" to construct confidence intervals for the model parameters.
For the non-penalized Lorenz regression, the returned object is a list containing:
theta
The estimated vector of parameters. In the penalized case, this is a matrix where each row corresponds to a different selection method (e.g., BIC, bootstrap, cross-validation).
Gi.expl
The estimated explained Gini coefficient. In the penalized case, this is a vector, where each element corresponds to a different selection method.
LR2
The Lorenz-R^2 of the regression. In the penalized case, this is a vector, where each element corresponds to a different selection method.
boot_out
An object of class "boot" containing the raw bootstrap output.
For the penalized Lorenz regression, the returned object includes:
path
See Lorenz.Reg for the original path. The out-of-bag (OOB) score is added.
lambda.idx
A vector indicating the index of the optimal lambda obtained by each selection method.
grid.idx
A vector indicating the index of the optimal grid parameter obtained by each selection method.
Note: In the penalized case, the returned object may have additional classes such as "PLR_cv" if cross-validation was performed and used for selection.
References
Heuchenne, C. and A. Jacquemain (2022). Inference for monotone single-index conditional means: A Lorenz regression approach. Computational Statistics & Data Analysis 167(C).
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
See Also
Lorenz.Reg, Lorenz.GA, Lorenz.SCADFABS, Lorenz.FABS, PLR.CV, boot
Examples
# Non-penalized regression example (not run due to execution time)
## Not run:
set.seed(123)
NPLR_boot <- Lorenz.boot(NPLR, R = 30)
confint(NPLR_boot) # Confidence intervals
summary(NPLR_boot)
## End(Not run)
# Penalized regression example:
set.seed(123)
PLR_boot <- Lorenz.boot(PLR, R = 20)
print(PLR_boot)
summary(PLR_boot)
coef(PLR_boot, pars.idx = "Boot")
predict(PLR_boot, pars.idx = "Boot")
plot(PLR_boot)
plot(PLR_boot, type = "diagnostic")
# Confidence intervals for different selection methods:
confint(PLR_boot, pars.idx = "BIC") # Using BIC-selected tuning parameters
confint(PLR_boot, pars.idx = "Boot") # Using bootstrap-selected tuning parameters
Combines bootstrap Lorenz regressions
Description
Lorenz.boot.combine combines the outputs of different instances of the Lorenz.boot function.
Usage
Lorenz.boot.combine(boot_list)
Arguments
boot_list |
list of objects, each element being the output of a call to the function Lorenz.boot. |
Value
An object of class c("LR_boot", "LR") or c("PLR_boot", "PLR"), depending on whether a non-penalized or penalized regression was fitted.
The method confint is used on an object of class "LR_boot" or "PLR_boot" to obtain bootstrap inference on the model parameters.
For the non-penalized Lorenz regression, the returned object is a list containing the following components:
theta
The estimated vector of parameters. In the penalized case, it is a matrix where each row corresponds to a different selection method (e.g., BIC, bootstrap, cross-validation).
Gi.expl
The estimated explained Gini coefficient. In the penalized case, it is a vector, where each element corresponds to a different selection method.
LR2
The Lorenz-R^2 of the regression. In the penalized case, it is a vector, where each element corresponds to a different selection method.
boot_out
An object of class "boot" containing the output of the bootstrap calculation.
For the penalized Lorenz regression, the returned object is a list containing the following components:
path
See Lorenz.Reg for the original path. To this path is added the out-of-bag (OOB) score.
lambda.idx
A vector indicating the index of the optimal lambda obtained by each selection method.
grid.idx
A vector indicating the index of the optimal grid parameter obtained by each selection method.
Note: The returned object may have additional classes such as "PLR_cv" if cross-validation was performed and used as a selection method in the penalized case.
References
Heuchenne, C. and A. Jacquemain (2022). Inference for monotone single-index conditional means: A Lorenz regression approach. Computational Statistics & Data Analysis 167(C).
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
Examples
# Continuing the Lorenz.Reg(.) example for the penalized regression:
boot_list <- list()
set.seed(123)
boot_list[[1]] <- Lorenz.boot(PLR, R = 10, boot_out_only = TRUE)
set.seed(456)
boot_list[[2]] <- Lorenz.boot(PLR, R = 10, boot_out_only = TRUE)
PLR_boot <- Lorenz.boot.combine(boot_list)
summary(PLR_boot)
Concentration curve of y with respect to x
Description
Lorenz.curve computes the concentration curve of a vector y with respect to another vector x.
If y and x are identical, the obtained concentration curve boils down to the Lorenz curve.
Usage
Lorenz.curve(
y,
x = y,
na.rm = TRUE,
ties.method = c("mean", "random"),
seed = NULL,
weights = NULL
)
Arguments
y |
variable of interest. |
x |
variable to use for the ranking. By default, x = y, in which case the Lorenz curve is computed. |
na.rm |
should missing values be deleted. Default value is TRUE. |
ties.method |
What method should be used to break the ties in the rank index. Possible values are "mean" (default value) or "random". If "random" is selected, the ties are broken by further ranking in terms of a uniformly distributed random variable. If "mean" is selected, the average rank method is used. |
seed |
seed imposed for the generation of the vector of uniform random variables used to break the ties. Default is NULL, in which case no seed is imposed. |
weights |
vector of sample weights. By default, each observation is given the same weight. |
Details
The parameter seed allows for local seed setting to control randomness in the generation of the uniform random variables.
The specified seed is applied to the respective part of the computation, and the seed is reverted to its previous state after the operation.
This ensures that the seed settings do not interfere with the global random state or other parts of the code.
Value
A function corresponding to the estimated Lorenz or concentration curve.
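To illustrate what such a function looks like, the base-R sketch below builds an empirical concentration curve by sorting the observations according to x and cumulating the shares of y. It assumes equal weights and ignores tie handling, and conc_curve_sketch is a hypothetical name rather than the package's implementation.

```r
# Hypothetical sketch (equal weights, no tie handling): concentration curve
# of y with respect to x, returned as a function of the population share p.
conc_curve_sketch <- function(y, x = y) {
  ord <- order(x)
  p <- c(0, seq_along(y) / length(y))          # cumulative population share
  L <- c(0, cumsum(y[ord]) / sum(y))           # cumulative share of y
  approxfun(p, L)                              # linear interpolation
}

y <- c(1, 2, 3, 4)
LC <- conc_curve_sketch(y)                     # x = y: Lorenz curve
LC(c(0, 0.5, 1))                               # 0.0 0.3 1.0
```

In this toy sample the poorest half of the observations holds (1 + 2) / 10 = 30% of the total, which is the value of the curve at p = 0.5.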
Examples
data(Data.Incomes)
# We first compute the Lorenz curve of Income
Y <- Data.Incomes$Income
Lorenz.curve(y = Y)
# Then we compute the concentration curve of Income with respect to Age
X <- Data.Incomes$Age
Lorenz.curve(y = Y, x = X)
Call to the genetic algorithm for the Lorenz regression
Description
Lorenz.ga.call encapsulates the call to ga for a local management of seed setting.
Usage
Lorenz.ga.call(
ties.method,
y,
x,
pi,
V,
popSize,
maxiter,
run,
parallel.GA,
suggestions,
seed = NULL
)
Arguments
ties.method |
Either |
y |
vector of responses. |
x |
matrix of covariates. |
pi |
sample weights (normalized). |
V |
vector of uniformly distributed rvs. |
popSize |
passed to |
maxiter |
passed to |
run |
passed to |
parallel.GA |
passed to |
suggestions |
passed to |
seed |
An optional integer for setting the seed for random number generation. Default is NULL. |
Value
The fitted genetic algorithm
Graphs of concentration curves
Description
Lorenz.graphs traces the Lorenz curve of a response and the concentration curves of the response with respect to each of a series of covariates.
Usage
Lorenz.graphs(formula, data, difference = FALSE, ...)
Arguments
formula |
A formula object of the form response ~ other_variables. The form response ~ 1 is used to display only the Lorenz curve of the response. |
data |
A dataframe containing the variables of interest |
difference |
A logical determining whether the vertical axis should be expressed in terms of deviation from perfect equality. Default is |
... |
Further arguments (see Section 'Arguments' in |
Value
A plot comprising
The Lorenz curve of response
The concentration curves of response with respect to each element of other_variables
See Also
Examples
data(Data.Incomes)
Lorenz.graphs(Income ~ Age + Work.Hours, data = Data.Incomes)
# Expressing now the vertical axis as the deviation from perfect equality
Lorenz.graphs(Income ~ Age + Work.Hours, data = Data.Incomes, difference = TRUE)
Determines the regularization parameter (lambda) in a PLR via optimization of an information criterion.
Description
PLR.BIC
takes as input a matrix of estimated parameter vectors, where each row corresponds to a covariate and each column corresponds to a value of lambda,
and returns the index of the optimal column by optimizing an information criterion. By default the BIC is used.
Usage
PLR.BIC(y, x, theta, weights = NULL, IC = c("BIC", "AIC"))
Arguments
y |
a vector of responses |
x |
a matrix of explanatory variables |
theta |
matrix gathering the path of estimated parameter vectors. Each row corresponds to a given covariate. Each column corresponds to a given value of lambda |
weights |
vector of sample weights. By default, each observation is given the same weight. |
IC |
indicates which information criterion is used. Possible values are "BIC" (default) or "AIC". |
Value
A list with two components
val
vector indicating the value attained by the information criterion for each value of lambda.
best
index of the value of lambda where the optimum is attained.
References
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
See Also
Lorenz.Reg
, Lorenz.FABS
, Lorenz.SCADFABS
Cross-validation for penalized Lorenz regression
Description
PLR.CV
performs k-fold cross-validation to select the grid and penalization parameters of the penalized Lorenz regression.
Usage
PLR.CV(object, k, seed.CV = NULL, parallel = FALSE, ...)
Arguments
object |
An object of class |
k |
An integer specifying the number of folds in the k-fold cross-validation. |
seed.CV |
An optional integer specifying a seed for reproducibility in the creation of the folds. Default is |
parallel |
A logical or numeric value controlling parallel computation. If |
... |
Additional arguments passed to either the cross-validation function |
Details
The parameter seed.CV
allows for local seed setting to control randomness in the generation of the folds.
The specified seed is applied to the respective part of the computation, and the seed is reverted to its previous state after the operation.
This ensures that the seed settings do not interfere with the global random state or other parts of the code.
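For readers unfamiliar with k-fold splitting, the following base R sketch shows how observations can be assigned to folds reproducibly. It is a simplified stand-in, since the package stores splits generated by vfold_cv. The helper make_folds is hypothetical, and the sketch sets the seed globally for brevity.

```r
# Hypothetical helper: randomly assign n observations to k folds of nearly
# equal size.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

set.seed(123)
n <- 23
k <- 5
folds <- make_folds(n, k)

# Each fold serves once as validation sample; the rest is used for fitting.
splits <- lapply(seq_len(k), function(j) {
  list(train = which(folds != j), valid = which(folds == j))
})
sapply(splits, function(s) length(s$valid))    # fold sizes differ by at most 1
```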
Value
An object of class c("PLR_cv", "PLR")
.
The returned list contains the following components:
path
See
Lorenz.Reg
for the original path. The cross-validation score is added.
lambda.idx
A vector indicating the index of the optimal lambda obtained by each selection method.
grid.idx
A vector indicating the index of the optimal grid parameter obtained by each selection method.
splits
A list storing the data splits used for cross-validation, as generated by
vfold_cv
.
Note: The returned object may have additional classes such as "PLR_boot"
if bootstrap was performed.
References
Jacquemain, A., C. Heuchenne, and E. Pircalabelu (2024). A penalised bootstrap estimation procedure for the explained Gini coefficient. Electronic Journal of Statistics 18(1) 247-300.
See Also
Lorenz.Reg
, Lorenz.SCADFABS
, Lorenz.FABS
, Lorenz.boot
Examples
# Continuing the Lorenz.Reg(.) example:
PLR_CV <- PLR.CV(PLR, k = 5, seed.CV = 123)
# The object now inherits from the class "PLR_cv".
# Hence the methods (also) display the results obtained by cross-validation.
print(PLR_CV)
summary(PLR_CV)
coef(PLR_CV, pars.idx = "CV")
predict(PLR_CV, pars.idx = "CV")
plot(PLR_CV)
plot(PLR_CV, type = "diagnostic") # Plot of the scores depending on the grid and penalty parameters
Penalized Lorenz Regression Fit Function
Description
PLR.fit
fits a penalized Lorenz regression model using either the LASSO or SCAD penalty.
It serves as an internal wrapper that applies the fit function over a grid of tuning parameter values.
Usage
PLR.fit(y, x, weights = NULL, penalty, grid.arg, grid.value, lambda.list, ...)
Arguments
y |
A numeric vector representing the response variable. |
x |
A numeric matrix of covariates. |
weights |
An optional numeric vector of sample weights. Default is |
penalty |
A character string specifying the penalty type. Possible values are |
grid.arg |
A character string specifying the tuning parameter for which a grid is constructed. |
grid.value |
A numeric vector specifying the grid values for |
lambda.list |
An optional list specifying penalty values ( |
... |
Additional arguments passed to |
Details
The function applies either Lorenz.FABS
(for LASSO) or Lorenz.SCADFABS
(for SCAD) for each grid value.
The best model is selected based on the BIC score.
Value
A list containing:
path
A list of matrices, where each element corresponds to a grid value. Each matrix contains lambda values, Lorenz-R^2, explained Gini coefficients, BIC scores, and estimated coefficients.
grid.idx
The index of the optimal grid parameter selected by the BIC criterion.
lambda.idx
The index of the optimal \lambda selected by the BIC criterion.
grid.value
The grid values used for grid.arg.
lambda.list
A list of \lambda values along the solution paths.
grid.arg
The tuning parameter for which the grid was constructed.
See Also
Lorenz.FABS
, Lorenz.SCADFABS
, Lorenz.boot
, Lorenz.Reg
Examples
data(Data.Incomes)
y <- Data.Incomes$Income
x <- as.matrix(Data.Incomes[,-c(1,2)])
PLR.fit(y, x, penalty = "SCAD", grid.arg = "eps", grid.value = c(0.2,0.5), lambda.list = NULL)
Re-normalizes the estimated coefficients of a penalized Lorenz regression
Description
PLR.normalize
transforms the estimated coefficients of a penalized Lorenz regression to match the model where the first category of each categorical variable is omitted.
Usage
PLR.normalize(object)
Arguments
object |
An object of S3 class |
Value
A matrix of re-normalized coefficients.
See Also
Computes Gini scores for the Penalized Lorenz Regression
Description
PLR.scores
computes the Gini scores (either OOB-scores or CV-scores) obtained for a specific validation sample and associated to a list of parameters obtained by the Penalized Lorenz Regression.
Usage
PLR.scores(y, x, weights, theta.list)
Arguments
y |
the vector of responses |
x |
the design matrix (after data management steps, i.e. standardization and transformations of the categorical covariates into binaries) |
weights |
vector of sample weights. By default, each observation is given the same weight. |
theta.list |
list of matrices. Each element of the list corresponds to a value of the grid parameter. The columns of the matrices correspond to values of the penalty parameters. The rows correspond to the different covariates. |
Value
A list of vectors gathering the Gini scores. Each element of the list corresponds to a value of the grid parameter and each element of the vector corresponds to a value of the penalization parameter.
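The shape of the input and output can be illustrated with a small base R sketch. It assumes that the score of a parameter vector is the (unweighted) concentration index of the response with respect to the index x %*% theta on the validation sample. That reading, and the helper conc_index, are assumptions made for illustration, not the package's code.

```r
# Hypothetical helper: concentration index of y w.r.t. an index, computed as
# 2 * cov(y, rank / n) / mean(y), with a population-type covariance.
conc_index <- function(y, index) {
  r <- rank(index, ties.method = "average") / length(y)
  2 * mean((y - mean(y)) * (r - mean(r))) / mean(y)
}

y <- c(1, 2, 3, 4)
x <- cbind(a = c(1, 2, 3, 4), b = c(4, 1, 3, 2))

# One grid value, two penalty values: rows are covariates, columns are lambdas.
theta.list <- list(cbind(c(1, 0), c(0, 1)))

scores <- lapply(theta.list, function(theta) {
  apply(theta, 2, function(th) conc_index(y, drop(x %*% th)))
})
scores[[1]]   # one score per column (penalty value) of the first grid element
```

The result mirrors the documented return value: a list with one vector per grid value, each vector holding one score per penalty value.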
Estimates a monotonic regression curve via Chernozhukov et al (2009)
Description
Rearrangement.estimation
estimates the increasing link function of a single index model via the methodology proposed in Chernozhukov et al (2009).
Usage
Rearrangement.estimation(
y,
index,
t = index,
weights = NULL,
method = "loess",
...
)
Arguments
y |
The response variable. |
index |
The estimated index. The user may obtain it using function |
t |
A vector of points over which the link function |
weights |
A vector of sample weights. By default, each observation is given the same weight. |
method |
Either a character string specifying a smoothing method (e.g.,
The specification of a custom method is illustrated in the Examples section.
If a character string is provided, a |
... |
Additional arguments passed to the fit function defined by |
Details
A first estimator of the link function, neglecting the assumption of monotonicity, is obtained using the procedure chosen via method
.
The final estimator is obtained through the rearrangement operation explained in Chernozhukov et al (2009). This operation is carried out with function rearrangement
from package Rearrangement.
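The two steps can be sketched in base R on simulated data. This is a simplified illustration under stated assumptions: the first step uses loess(), and the rearrangement is approximated by sorting the fitted values at the sorted evaluation points, which is its discrete counterpart when the evaluation points are the sample itself. The package relies on rearrangement from package Rearrangement instead.

```r
set.seed(42)
index <- sort(runif(200))                      # evaluation points, in increasing order
y     <- index^2 + rnorm(200, sd = 0.1)        # increasing link plus noise

# Step 1: unconstrained smooth, which need not be monotone
fit   <- loess(y ~ index)
H_raw <- predict(fit, newdata = data.frame(index = index))

# Step 2: rearrangement, here simply the sorted fitted values
H_mono <- sort(H_raw)

is.unsorted(H_mono)                            # FALSE: the estimate is non-decreasing
```

Sorting only permutes the fitted values, so their average is unchanged while monotonicity is enforced.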
Value
A list with the following components
t
the points over which the estimation has been undertaken.
H
the estimated link function evaluated at t.
References
Chernozhukov, V., I. Fernández-Val, and A. Galichon (2009). Improving Point and Interval Estimators of Monotone Functions by Rearrangement. Biometrika 96 (3). 559–75.
See Also
Examples
data(Data.Incomes)
PLR <- Lorenz.Reg(Income ~ ., data = Data.Incomes,
penalty = "SCAD", eps = 0.01)
y <- PLR$y
index <- predict(PLR)
# Default method where the first step is obtained with loess()
Rearrangement.estimation(y = y, index = index, method = "loess")
# Custom method, where the first step is obtained with ksmooth()
# ksmooth() does not separate the fitting and predicting interfaces
ksmooth_method <- list(
fit_fun = function(y, x, ...) {
list(y = y, x = x, args = list(...))
},
predict_fun = function(fit, newdata) {
if(missing(newdata)){
x.points <- fit$x
} else {
x.points <- newdata[,1]
}
o <- order(order(x.points))
yhat <- do.call(ksmooth, c(list(x = fit$x, y = fit$y, x.points = x.points), fit$args))$y
yhat[o]
}
)
Rearrangement.estimation(y = y, index = index, method = ksmooth_method, bandwidth = 0.1)
Plots for the Lorenz regression
Description
autoplot
generates a plot for an object of class "LR"
and returns it as a ggplot
object.
The plot
method is a wrapper around autoplot
that directly displays the plot,
providing a more familiar interface for users accustomed to base R plotting.
Usage
## S3 method for class 'LR'
autoplot(object, type = c("explained", "residuals"), band.level = 0.95, ...)
## S3 method for class 'LR'
plot(x, ...)
Arguments
object |
An object of class |
type |
A character string indicating the type of plot. Possible values are
|
band.level |
Confidence level for the bootstrap confidence intervals. |
... |
Additional arguments passed either to |
x |
An object of class |
Value
autoplot
returns a ggplot
object representing the desired graph. plot
directly displays this plot.
See Also
Examples
## For examples see example(Lorenz.Reg)
Plots for the penalized Lorenz regression
Description
autoplot
generates summary plots for an object of class "PLR"
and returns them as ggplot
objects.
The plot
method is a wrapper around autoplot
that directly displays the plot,
providing a more familiar interface for users accustomed to base R plotting.
Usage
## S3 method for class 'PLR'
autoplot(
object,
type = c("explained", "traceplot", "diagnostic", "residuals"),
traceplot.which = "BIC",
pars.idx = "BIC",
score.df = NULL,
band.level = 0.95,
...
)
## S3 method for class 'PLR'
plot(x, ...)
Arguments
object |
An object of class |
type |
A character string indicating the type of plot. Possible values are
|
traceplot.which |
This argument indicates the value of the grid parameter for which the traceplot should be produced (see arguments |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
score.df |
A data.frame providing the scores to be displayed if |
band.level |
Confidence level for the bootstrap confidence intervals. |
... |
Additional arguments passed either to |
x |
An object of class |
Details
The available selection methods depend on the classes of the object: BIC is always available, bootstrap is available if object
inherits from "PLR_boot"
, cross-validation is available if object
inherits from "PLR_cv".
Value
autoplot
returns a ggplot
object representing the desired graph. plot
directly displays this plot.
See Also
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Estimated coefficients for the Lorenz regression
Description
Provides the estimated coefficients for an object of class "LR"
.
Usage
## S3 method for class 'LR'
coef(object, ...)
Arguments
object |
An object of S3 class |
... |
Additional arguments. |
Value
a vector gathering the estimated coefficients
See Also
Examples
## For examples see example(Lorenz.Reg)
Estimated coefficients for the penalized Lorenz regression
Description
Provides the estimated coefficients for an object of class "PLR"
.
Usage
## S3 method for class 'PLR'
coef(object, renormalize = TRUE, pars.idx = "BIC", ...)
Arguments
object |
An object of S3 class |
renormalize |
A logical value determining whether the coefficient vector should be re-normalized to match the representation where the first category of each categorical variable is omitted. Default value is TRUE |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
... |
Additional arguments |
Value
a vector gathering the estimated coefficients.
See Also
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Confidence intervals for the Lorenz regression
Description
Provides bootstrap confidence intervals for the explained Gini coefficient, Lorenz-R2 and theta vector for an object of class "LR_boot"
.
Usage
## S3 method for class 'LR_boot'
confint(
object,
parm = c("Gini", "LR2", "theta"),
level = 0.95,
type = c("norm", "basic", "perc"),
bias.corr = TRUE,
...
)
Arguments
object |
An object of class |
parm |
A character string determining whether the confidence interval is computed for the explained Gini coefficient, for the Lorenz- |
level |
A numeric giving the level of the confidence interval. Default value is 0.95. |
type |
A character string specifying the bootstrap method. Possible values are |
bias.corr |
A logical determining whether bias correction should be performed. Only used if |
... |
Additional arguments. |
Value
The desired confidence interval.
If parm="Gini"
or parm="LR2"
, the output is a vector.
If parm="theta"
, it is a matrix where each row corresponds to a different coefficient.
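The three interval types follow standard bootstrap constructions, which can be sketched in base R as below. The helper boot_ci and the toy replicates are hypothetical, and applying the bias correction to the normal interval only is an assumption made for this illustration.

```r
# Hypothetical helper: bootstrap confidence interval from an estimate `est`
# and a vector of bootstrap replicates `reps`.
boot_ci <- function(est, reps, level = 0.95,
                    type = c("norm", "basic", "perc"), bias.corr = TRUE) {
  type <- match.arg(type)
  a <- (1 - level) / 2
  q <- unname(quantile(reps, c(a, 1 - a)))
  switch(type,
    norm = {
      bias <- if (bias.corr) mean(reps) - est else 0
      est - bias + c(-1, 1) * qnorm(1 - a) * sd(reps)
    },
    basic = 2 * est - rev(q),   # reflect the quantiles around the estimate
    perc  = q                   # quantiles of the bootstrap distribution
  )
}

set.seed(2)
est  <- 0.41                             # made-up estimate
reps <- rnorm(1000, mean = 0.40, sd = 0.05)  # made-up bootstrap replicates
ci_norm  <- boot_ci(est, reps, type = "norm")
ci_basic <- boot_ci(est, reps, type = "basic")
ci_perc  <- boot_ci(est, reps, type = "perc")
```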
See Also
Examples
## For examples see example(Lorenz.boot)
Confidence intervals for the penalized Lorenz regression
Description
Provides bootstrap confidence intervals for the explained Gini coefficient and Lorenz-R^2
for an object of class "PLR_boot"
.
Usage
## S3 method for class 'PLR_boot'
confint(
object,
parm = c("Gini", "LR2"),
level = 0.95,
type = c("norm", "basic", "perc"),
pars.idx = "BIC",
bias.corr = TRUE,
...
)
Arguments
object |
An object of class |
parm |
A character string determining whether the confidence interval is computed for the explained Gini coefficient or for the Lorenz- |
level |
A numeric giving the level of the confidence interval. Default value is 0.95. |
type |
A character string specifying the bootstrap method. Possible values are |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
bias.corr |
A logical determining whether bias correction should be performed. Only used if |
... |
Additional arguments. |
Value
A vector providing the desired confidence interval.
See Also
Examples
## For examples see example(Lorenz.boot)
Diagnostic for the penalized Lorenz regression
Description
diagnostic.PLR
provides diagnostic information for an object of class "PLR".
It restricts the path of the PLR to pairs of parameters (grid, lambda) that satisfy a threshold criterion.
Usage
diagnostic.PLR(
object,
tol = 0.99,
method = c("union", "intersect", "BIC", "Boot", "CV")
)
Arguments
object |
An object of class |
tol |
A numeric threshold value used to restrict the PLR path. More specifically, we restrict to pairs (grid,lambda) whose normalized score exceeds |
method |
A character string specifying the method used to evaluate the scores.
Options are
|
Value
A list with two elements:
path
The restricted model path, containing only the values of the pair (grid, lambda) that satisfy the threshold criterion.
best
The best model. It is obtained by considering the pair (grid, lambda) in the restricted path that leads to the sparsest model. If several pairs yield the same level of sparsity, we consider the pair that maximizes the minimum score across all selection methods available.
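The selection logic can be illustrated on a toy path in base R. The numbers are made up, the normalization (score divided by the highest score) is an assumption, and the tie-breaking rule across selection methods is left out.

```r
# Toy path for a single grid value: one row per lambda (made-up numbers)
path <- data.frame(
  lambda  = c(0.50, 0.20, 0.10, 0.05),
  score   = c(0.300, 0.397, 0.399, 0.400),
  nonzero = c(1, 3, 5, 6)                # number of selected covariates
)

tol  <- 0.99
keep <- path$score / max(path$score) >= tol    # assumed normalization
restricted <- path[keep, ]                     # pairs close enough to the best score

# Among the retained pairs, pick the sparsest model
best <- restricted[which.min(restricted$nonzero), ]
best
```

Here the three smallest values of lambda pass the threshold, and the one with three non-zero coefficients is retained as the sparsest.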
See Also
Examples
# Continuing the Lorenz.boot(.) example:
# The out-of-bag score seems to remain relatively flat when lambda is small enough
plot(PLR_boot, type = "diagnostic")
# Which (grid, penalty) pair is the best among those close enough to the highest OOB score?
diagnostic.PLR(PLR_boot, tol = 0.99, method = "Boot")
# We want the solution to be close to the best, for both the BIC and OOB scores.
diagnostic.PLR(PLR_boot, method = "intersect")
Retrieve a measure of explained inequality from a model
Description
This generic function extracts a measure of explained inequality, such as the explained Gini coefficient or the Lorenz-R2, from a fitted model object.
Usage
ineqExplained(object, type = c("Gini.explained", "Lorenz-R2"), ...)
Arguments
object |
An object for which the inequality metrics should be extracted. |
type |
Character string specifying the type of inequality metric to retrieve. Options are |
... |
Additional arguments passed to specific methods. |
Value
The requested inequality metric.
See Also
Examples
## For examples see example(Lorenz.Reg)
Explained inequality metrics for the Lorenz regression
Description
Retrieves the explained Gini coefficient or the Lorenz-R^2
from an object of class "LR"
.
Usage
## S3 method for class 'LR'
ineqExplained(object, type = c("Gini.explained", "Lorenz-R2"), ...)
Arguments
object |
An object of S3 class |
type |
Character string specifying the type of inequality metric to retrieve. Options are |
... |
Additional arguments. |
Value
A numeric value representing the requested inequality metric.
Explained inequality metrics for the penalized Lorenz regression
Description
Retrieves the explained Gini coefficient or the Lorenz-R^2
from an object of class "PLR"
.
Usage
## S3 method for class 'PLR'
ineqExplained(
object,
type = c("Gini.explained", "Lorenz-R2"),
pars.idx = "BIC",
...
)
Arguments
object |
An object of S3 class |
type |
Character string specifying the type of inequality metric to retrieve. Options are |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
... |
Additional arguments. |
Value
A numeric value representing the requested inequality metric.
Explained inequality metrics for (generalized) linear models
Description
Retrieves the explained Gini coefficient or the Lorenz-R^2
from an object of class "lm"
.
Usage
## S3 method for class 'lm'
ineqExplained(object, type = c("Gini.explained", "Lorenz-R2"), ...)
Arguments
object |
An object of S3 class |
type |
Character string specifying the type of inequality metric to retrieve. Options are |
... |
Additional arguments passed to |
Value
A numeric value representing the requested inequality metric.
Design matrix in the Penalized Lorenz Regression
Description
model_matrix_PLR
is a utility function that provides the design matrix for the Penalized Lorenz Regression.
Usage
model_matrix_PLR(mt, mf)
Arguments
mt |
Model terms |
mf |
Model frame |
Details
This function ensures that the design matrix is constructed according to the requirements of the PLR. In PLR, one must exclude the intercept and use one-hot encoding for all categorical variables, except binary ones.
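A design matrix of this kind can be built with base R tools, as in the sketch below. It is one possible construction under the reading given above (full dummy coding for factors with more than two levels, a single column for binary factors, no intercept). The toy data frame is made up and the code is not the package's implementation.

```r
df <- data.frame(
  age    = c(25, 40, 31, 58),
  sex    = factor(c("F", "M", "M", "F")),                   # binary: one column
  region = factor(c("north", "south", "east", "north"))     # 3 levels: 3 columns
)

# Identity contrasts (one column per level) for factors with more than 2 levels
is_multi     <- vapply(df, function(v) is.factor(v) && nlevels(v) > 2, logical(1))
full_dummies <- lapply(df[is_multi], contrasts, contrasts = FALSE)

# Build with an intercept so that every factor is coded through its contrasts,
# then drop the intercept column
X <- model.matrix(~ ., data = df, contrasts.arg = full_dummies)[, -1, drop = FALSE]
colnames(X)
```

Building the matrix with an intercept and removing it afterwards avoids the default behaviour of a no-intercept formula, which would expand the first factor into all of its levels even when it is binary.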
Value
The design matrix
Prediction and fitted values for the Lorenz regression
Description
predict
provides predictions for an object of class "LR"
,
while fitted
extracts the fitted values.
Usage
## S3 method for class 'LR'
predict(object, newdata, type = c("index", "response"), ...)
## S3 method for class 'LR'
fitted(object, type = c("index", "response"), ...)
Arguments
object |
An object of class |
newdata |
An optional data frame in which to look for variables with which to predict. If omitted, the original data are used. |
type |
A character string indicating the type of prediction or fitted values. Possible values are |
... |
Additional arguments passed to the function |
Details
The type
argument distinguishes between two types of prediction outputs, aligned with the goals of the Lorenz regression.
When type = "index"
, the function returns the estimated index X^\top \theta
of the single-index model. This index captures the full ordering structure of the conditional expectation and is sufficient for computing the explained Gini coefficient, which is the primary focus of the method. Crucially, this estimation does not require recovering the full nonparametric link function.
When type = "response"
, the function estimates the full conditional expectation \mathbb{E}[Y | X]
by performing a second-stage estimation of the link function via Rearrangement.estimation
. This is useful if fitted or predicted response values are needed for other purposes.
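The claim that the index alone suffices for the explained Gini coefficient rests on the fact that a concentration index depends on the index only through the ranking it induces. A small base R sketch on simulated data illustrates this. The helper conc_index is hypothetical and ignores sample weights.

```r
# Hypothetical helper: concentration index of y with respect to an index
conc_index <- function(y, index) {
  r <- rank(index, ties.method = "average") / length(y)
  2 * mean((y - mean(y)) * (r - mean(r))) / mean(y)
}

set.seed(7)
index <- rnorm(100)                           # plays the role of the estimated index
y     <- exp(index + rnorm(100, sd = 0.5))    # response with an increasing link

g_index <- conc_index(y, index)
g_trans <- conc_index(y, exp(3 * index))      # strictly increasing transformation

# Both values coincide because the ranks are unchanged by the transformation
all.equal(g_index, g_trans)
```

This is why the link function only needs to be estimated when predictions on the scale of the response are required.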
Value
A vector of predictions for predict
, or a vector of fitted values for fitted
.
See Also
Lorenz.Reg
, Rearrangement.estimation
Examples
## For examples see example(Lorenz.Reg) and example(Lorenz.boot)
Prediction and fitted values for the penalized Lorenz regression
Description
predict
provides predictions for an object of class "PLR"
,
while fitted
extracts the fitted values.
Usage
## S3 method for class 'PLR'
predict(object, newdata, type = c("index", "response"), pars.idx = "BIC", ...)
## S3 method for class 'PLR'
fitted(object, type = c("index", "response"), pars.idx = "BIC", ...)
Arguments
object |
An object of S3 class |
newdata |
An optional data frame in which to look for variables with which to predict. If omitted, the original data are used. |
type |
A character string indicating the type of prediction or fitted values. Possible values are |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
... |
Additional arguments passed to the function |
Details
The type
argument distinguishes between two types of prediction outputs, aligned with the goals of the penalized Lorenz regression.
When type = "index"
, the function returns the estimated index X^\top \theta
of the single-index model. This index captures the full ordering structure of the conditional expectation and is sufficient for computing the explained Gini coefficient, which is the primary focus of the method. Crucially, this estimation does not require recovering the full nonparametric link function.
When type = "response"
, the function estimates the full conditional expectation \mathbb{E}[Y | X]
by performing a second-stage estimation of the link function via Rearrangement.estimation
. This is useful if fitted or predicted response values are needed for other purposes.
Value
A vector of predictions for predict
, or a vector of fitted values for fitted
.
See Also
Lorenz.Reg
, Rearrangement.estimation
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Printing method for the Lorenz regression
Description
Prints the arguments, explained Gini coefficient and estimated coefficients of an object of class "LR"
.
Usage
## S3 method for class 'LR'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
Arguments
x |
An object of class |
digits |
The number of significant digits to be passed. |
... |
Additional arguments. |
Value
No return value, called for printing an object of class "LR"
to the console.
See Also
Examples
## For examples see example(Lorenz.Reg)
Printing method for the penalized Lorenz regression
Description
Prints the arguments, explained Gini coefficient and estimated coefficients of an object of class "PLR"
.
Usage
## S3 method for class 'PLR'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
Arguments
x |
An object of S3 class |
digits |
The number of significant digits to be passed. |
... |
Additional arguments. |
Details
The explained Gini coefficient and estimated coefficients are returned for each available selection method, depending on the class of x
.
Value
No return value, called for printing an object of class "PLR"
to the console.
See Also
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Printing method for the summary of a Lorenz regression
Description
Provides a printing method for an object of class "summary.LR"
.
Usage
## S3 method for class 'summary.LR'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'summary.LR_boot'
print(
x,
digits = max(3L, getOption("digits") - 3L),
signif.stars = getOption("show.signif.stars"),
...
)
Arguments
x |
An object of class |
digits |
Number of significant digits to be passed. |
... |
Additional arguments passed to the function |
signif.stars |
Logical determining whether p-values should be also encoded visually. See the help of the function |
Value
No return value, called for printing an object of class "summary.LR"
to the console.
See Also
Examples
## For examples see example(Lorenz.Reg) and example(Lorenz.boot)
Printing method for the summary of a penalized Lorenz regression
Description
Provides a printing method for an object of class "summary.PLR"
.
Usage
## S3 method for class 'summary.PLR'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
Arguments
x |
An object of class |
digits |
Number of significant digits to be passed. |
... |
Additional arguments passed to the function |
Value
No return value, called for printing an object of class "summary.PLR"
to the console.
See Also
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Residuals for the Lorenz regression
Description
residuals
provides residuals for an object of class "LR"
.
Usage
## S3 method for class 'LR'
residuals(object, ...)
Arguments
object |
An object of class |
... |
Additional arguments passed to the function |
Details
Computing residuals entails estimating the link function of the single-index model. This is done via the function Rearrangement.estimation
.
Value
A vector of residuals.
See Also
Lorenz.Reg
, Rearrangement.estimation
Examples
## For examples see example(Lorenz.Reg) and example(Lorenz.boot)
Residuals for the penalized Lorenz regression
Description
residuals
provides residuals for an object of class "PLR"
.
Usage
## S3 method for class 'PLR'
residuals(object, pars.idx = "BIC", ...)
Arguments
object |
An object of class |
pars.idx |
What grid and penalty parameters should be used for parameter selection. Either a character string specifying the selection method, where the possible values are:
Or a numeric vector of length 2, where the first element is the index of the grid parameter and the second is the index of the penalty parameter. |
... |
Additional arguments passed to the function |
Details
Computing residuals entails estimating the link function of the single-index model. This is done via the function Rearrangement.estimation
.
Value
A vector of residuals.
See Also
Lorenz.Reg
, Rearrangement.estimation
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)
Generates a sample of uniform random variables with a specific seed
Description
runif_seed
generates a vector of uniform(0,1) random variables with a specific seed. The seed is only used locally.
Usage
runif_seed(n, min = 0, max = 1, seed = NULL)
Arguments
n |
the sample size |
seed |
the seed to use |
Value
A vector with the generated random variables
Summary for the Lorenz regression
Description
Provides a summary for an object of class "LR"
.
Usage
## S3 method for class 'LR'
summary(object, ...)
Arguments
object |
An object of class |
... |
Additional arguments. |
Details
The inference provided in the coefficients
matrix is obtained by using the asymptotic normality and estimating the asymptotic variance via bootstrap.
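The construction of such a table can be sketched in base R. All numbers below are made up, and the sketch only shows the generic recipe (bootstrap standard error, z-value, two-sided normal p-value), not the package's internal code.

```r
theta_hat <- c(Age = 0.60, Work.Hours = 0.40)     # made-up estimates

set.seed(1)
boot_reps <- cbind(                               # made-up bootstrap replicates
  Age        = rnorm(500, mean = 0.60, sd = 0.10),
  Work.Hours = rnorm(500, mean = 0.40, sd = 0.25)
)

se <- apply(boot_reps, 2, sd)                     # bootstrap standard errors
z  <- theta_hat / se                              # z-values
p  <- 2 * pnorm(-abs(z))                          # two-sided p-values under normality

coef_table <- cbind(Estimate = theta_hat, Std.Error = se, z.value = z, p.value = p)
coef_table
```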
Value
An object of class "summary.LR"
, containing the following elements:
call
The matched call.
ineq
A matrix with one row and three columns providing information on explained inequality. The first column gives the explained Gini coefficient, the second column gives the Gini coefficient of the response, and the third column gives the Lorenz-R^2.
coefficients
A matrix providing information on the estimated coefficients. The first column gives the estimates. If object inherits from "LR_boot", bootstrap inference was performed and the matrix contains further information. The second column is the bootstrap standard error. The third column is the z-value. Finally, the last column is the p-value. In this case, the class "summary.LR_boot" is added to the output.
See Also
Examples
## For examples see example(Lorenz.Reg) and example(Lorenz.boot)
Summary for the penalized Lorenz regression
Description
Provides a summary for an object of class "PLR"
.
Usage
## S3 method for class 'PLR'
summary(object, renormalize = TRUE, ...)
Arguments
object |
An object of class |
renormalize |
A logical value determining whether the coefficient vector should be re-normalized to match the representation where the first category of each categorical variable is omitted. Default value is TRUE |
... |
Additional arguments |
Value
An object of class "summary.PLR"
, which contains:
call
The matched call.
ineq
A table of explained inequality metrics. The columns display the explained Gini coefficient, the Gini coefficient of the response, and the Lorenz-R2. The first row contains the results obtained by BIC.
coefficients
A matrix with estimated coefficients, each row corresponding to a specific coefficient. The first column contains the results obtained by BIC.
If the object inherits from "PLR_boot"
, ineq
and coefficients
also include results from bootstrap, and the class "summary.PLR_boot"
is added to the output.
Similarly, if the object inherits from "PLR_cv"
, ineq
and coefficients
also include results from cross-validation, and the class "summary.PLR_cv"
is added to the output.
See Also
Lorenz.Reg
, Lorenz.boot
, PLR.CV
Examples
## For examples see example(Lorenz.Reg), example(Lorenz.boot) and example(PLR.CV)