Type: | Package |
Title: | Routines for L1 Estimation |
Version: | 0.60 |
Date: | 2025-07-08 |
Maintainer: | Felipe Osorio <faosorios.stat@gmail.com> |
Description: | L1 estimation for linear regression using Barrodale and Roberts' method <doi:10.1145/355616.361024> and the EM algorithm <doi:10.1023/A:1020759012226>. Estimation of mean and covariance matrix using the multivariate Laplace distribution, density, distribution function, quantile function and random number generation for univariate and multivariate Laplace distribution <doi:10.1080/03610929808832115>. Implementation of Naik and Plungpongpun <doi:10.1007/0-8176-4487-3_7> for the Generalized spatial median estimator is included. |
Depends: | R(≥ 3.5.0), fastmatrix |
LinkingTo: | fastmatrix |
Imports: | stats, grDevices, graphics |
License: | GPL-3 |
URL: | https://github.com/faosorios/L1pack |
NeedsCompilation: | yes |
LazyLoad: | yes |
Packaged: | 2025-07-08 19:41:28 UTC; root |
Author: | Felipe Osorio |
Repository: | CRAN |
Date/Publication: | 2025-07-08 20:10:02 UTC |
The symmetric Laplace distribution
Description
Density, distribution function, quantile function and random generation for the
Laplace distribution with location parameter location
and scale parameter
scale
.
Usage
dlaplace(x, location = 0, scale = 1, log = FALSE)
plaplace(q, location = 0, scale = 1, lower.tail = TRUE, log.p = FALSE)
qlaplace(p, location = 0, scale = 1, lower.tail = TRUE, log.p = FALSE)
rlaplace(n, location = 0, scale = 1)
Arguments
x , q |
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. |
location |
location parameter |
scale |
scale parameter |
log , log.p |
logical; if TRUE, probabilities |
lower.tail |
logical; if TRUE (default), probabilities are |
Details
If location
or scale
are not specified, they assume the default
values of 0
and 1
respectively.
The Laplace distribution with location \mu
and scale \phi
has density
f(x) =
\frac{1}{\sqrt{2}\phi} \exp(-\sqrt{2}|x-\mu|/\phi),
where -\infty < y < \infty
, -\infty < \mu < \infty
and \phi > 0
.
The mean is \mu
and the variance is \phi^2
.
The cumulative distribution function, assumes the form
F(x) =
\left\{\begin{array}{ll}
\frac{1}{2} \exp(\sqrt{2}(x - \mu)/\phi) & x < \mu, \\
1 - \frac{1}{2} \exp(-\sqrt{2}(x - \mu)/\phi) & x \geq \mu.
\end{array}\right.
The quantile function, is given by
F^{-1}(p) = \left\{\begin{array}{ll}
\mu + \frac{\phi}{\sqrt{2}} \log(2p) & p < 0.5, \\
\mu - \frac{\phi}{\sqrt{2}} \log(2(1-p)) & p \geq 0.5.
\end{array}\right.
Value
dlaplace
, plaplace
, and qlaplace
are respectively the density,
distribution function and quantile function of the Laplace distribution. rlaplace
generates random deviates drawn from the Laplace distribution, the length of the result
is determined by n
.
Author(s)
Felipe Osorio and Tymoteusz Wolodzko
References
Kotz, S., Kozubowski, T.J., Podgorski, K. (2001). The Laplace Distributions and Generalizations. Birkhauser, Boston.
Krishnamoorthy, K. (2006). Handbook of Statistical Distributions with Applications, 2nd Ed. Chapman & Hall, Boca Raton.
See Also
Distributions for other standard distributions and rmLaplace
for the random generation from the multivariate Laplace distribution.
Examples
x <- rlaplace(1000)
## QQ-plot for Laplace data against true theoretical distribution:
qqplot(qlaplace(ppoints(1000)), x, main = "Laplace QQ-plot",
xlab = "Theoretical quantiles", ylab = "Sample quantiles")
abline(c(0,1), col = "red", lwd = 2)
Estimation of location and scatter using the multivariate Laplace distribution
Description
Estimates the location vector and scatter matrix assuming the data came from a multivariate Laplace distribution.
Usage
LaplaceFit(x, data, subset, na.action, tol = 1e-6, maxiter = 200)
Arguments
x |
a formula or a numeric matrix or an object that can be coerced to a numeric matrix. |
data |
an optional data frame (or similar: see |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fitting process. |
na.action |
a function that indicates what should happen when the data contain NAs. |
tol |
the relative tolerance in the iterative algorithm. |
maxiter |
maximum number of iterations. The default is 200. |
Value
A list with class 'LaplaceFit'
containing the following components:
call |
a list containing an image of the |
center |
final estimate of the location vector. |
Scatter |
final estimate of the scale matrix. |
logLik |
the log-likelihood at convergence. |
numIter |
the number of iterations used in the iterative algorithm. |
weights |
estimated weights corresponding to the Laplace distribution. |
distances |
estimated squared Mahalanobis distances. |
Generic function print
show the results of the fit.
References
Yavuz, F.G., Arslan, O. (2018). Linear mixed model with Laplace distribution (LLMM). Statistical Papers 59, 271-289.
See Also
Examples
fit <- LaplaceFit(stack.x)
fit
# covariance matrix
p <- fit$dims[2]
Sigma <- (4 * (p + 1)) * fit$Scatter
Sigma
Wilson-Hilferty transformation
Description
Returns the Wilson-Hilferty transformation for multivariate Laplace deviates.
Usage
WH.Laplace(x, center, Scatter)
Arguments
x |
object of class |
center |
mean vector of the distribution or data vector of length |
Scatter |
Scatter matrix ( |
Details
Let T = D/(2p)
be a Gamma distributed random variable, where D^2
denotes the squared Mahalanobis distance defined as
D^2 = (\bold{x} - \bold{\mu})^T \bold{\Sigma}^{-1} (\bold{x} - \bold{\mu}).
Thus, the Wilson-Hilferty transformation is given by
z = \frac{T^{1/3} - (1 - \frac{1}{9p})}{(\frac{1}{9p})^{1/2}}%
and z
is approximately distributed as a standard normal distribution. This is useful,
for instance, in the construction of QQ-plots.
References
Osorio, F., Galea, M., Henriquez, C., Arellano-Valle, R. (2023). Addressing non-normality in multivariate analysis using the t-distribution. AStA Advances in Statistical Analysis 107, 785-813.
Terrell, G.R. (2003). The Wilson-Hilferty transformation is locally saddlepoint. Biometrika 90, 445-453.
Wilson, E.B., Hilferty, M.M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences of the United States of America 17, 684-688.
Examples
Scatter <- matrix(c(1,.5,.5,1), ncol = 2)
Scatter
# generate the sample
y <- rmLaplace(n = 500, Scatter = Scatter)
fit <- LaplaceFit(y)
z <- WH.Laplace(fit)
par(pty = "s")
qqnorm(z, main = "Transformed distances QQ-plot")
abline(c(0,1), col = "red", lwd = 2)
Confidence intervals from lad
models
Description
Computes confidence intervals for one or more parameters in a fitted
model associated to a lad
object.
Usage
## S3 method for class 'lad'
confint(object, parm, level = 0.95, ...)
Arguments
object |
a fitted model object. |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
... |
additional argument(s) for related methods. |
Details
confint
is a generic function. Confidence intervals associated to lad
objects are asymptotic, and needs suitable coef
and vcov
methods to be available.
Value
A matrix (or vector) with columns giving lower and upper confidence limits for
each parameter. These will be labelled as (1-level)/2
and 1 - (1-level)/2
in % (by default 2.5% and 97.5%).
See Also
confint.glm
and
confint.nls
in package MASS.
Examples
fm <- lad(stack.loss ~ ., data = stackloss, method = "BR")
confint(fm) # based on asymptotic normality
QQ-plot with simulated envelopes
Description
Constructs a normal QQ-plot using a Wilson-Hilferty transformation for the estimated Mahalanobis distances obtained from the LaplaceFit
procedure.
Usage
envelope.Laplace(object, reps = 50, conf = 0.95, plot.it = TRUE)
Arguments
object |
an object of class |
reps |
number of simulated point patterns to be generated when computing the envelopes. The default number is 50. |
conf |
the confidence level of the envelopes required. The default is to find 95% confidence envelopes. |
plot.it |
if |
Value
A list with the following components :
transformed |
a vector with the |
envelope |
a matrix with two columns corresponding to the values of the lower and upper pointwise confidence envelope. |
References
Atkinson, A.C. (1985). Plots, Transformations and Regression. Oxford University Press, Oxford.
Osorio, F., Galea, M., Henriquez, C., Arellano-Valle, R. (2023). Addressing non-normality in multivariate analysis using the t-distribution. AStA Advances in Statistical Analysis 107 785-813.
Vallejos, R., Osorio, F., Ferrer, C. (2025+). A new coefficient to measure agreement between continuous variables. Working paper.
Excess returns for Martin Marietta and American Can companies
Description
Data from the Martin Marietta and American Can companies collected over a period of 5 years on a monthly basis.
Usage
data(ereturns)
Format
A data frame with 60 observations on the following 4 variables.
- Date
the month in which the observations were collected.
- am.can
excess returns from the American Can company.
- m.marietta
excess returns from the Martin Marietta company.
- CRSP
an index for the excess rate returns for the New York stock exchange.
Source
Butler, R.J., McDonald, J.B., Nelson, R.D., and White, S.B. (1990). Robust and partially adaptive estimation of regression models. The Review of Economics and Statistics 72, 321-327.
L1 concordance correlation coefficient
Description
Calculates L1 concordance correlation coefficient for evaluating the degree of agreement between measurements generated by two different methods.
Usage
l1ccc(x, data, equal.means = FALSE, boots = TRUE, nsamples = 1000, subset, na.action)
Arguments
x |
a formula or a numeric matrix or an object that can be coerced to a numeric matrix. |
data |
an optional data frame (or similar: see |
equal.means |
logical, should the means of the measuring devices be considered equal? In which case the restricted estimation is carried out under this assumption. |
boots |
logical, hould use bootstrap to approximate the variances of the L1 estimators. |
nsamples |
number of bootstrap samples (default to 1000), only used if |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fitting process. |
na.action |
a function that indicates what should happen when the data contain NAs. |
Value
A list with class 'l1ccc'
containing the following named components:
call |
a list containing an image of the |
x |
|
rho1 |
L1 estimate of the concordance correlation coefficient. |
var.rho1 |
approximate variance of the L1 concordance correlation coefficient,
only present if |
L1 |
list with L1 estimates using Laplace, and normal distributions, and U-statistics. |
Lin |
Lin's concordance correlation coefficient under the Laplace distribution. |
ustat |
L1 estimation of concordance coefficient using U-statistics,
a list containing the following elements |
center |
the estimated mean vector. |
Scatter |
the estimated Scatter (or Scale) matrix. |
logLik |
the log-likelihood at convergence. |
weights |
estimated weights corresponding to the Laplace distribution. |
Restricted |
available only if |
References
King, T.S., Chinchilli, V.M. (2001). A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20, 2131-2147.
King, T.S., Chinchilli, V.M. (2001). Robust estimators of the concordance correlation coefficient. Journal of Biopharmaceutical Statistics 11, 83-105.
Lin, L. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255-268.
Vallejos, R., Osorio, F., Ferrer, C. (2025+). A new coefficient to measure agreement between two continuous variables. Working paper.
Examples
## data from Bland and Altman (1986). The Lancet 327, 307-310.
x <- list(Large = c(494,395,516,434,476,557,413,442,650,433,
417,656,267,478,178,423,427),
Mini = c(512,430,520,428,500,600,364,380,658,445,
432,626,260,477,259,350,451))
x <- as.data.frame(x)
plot(Mini ~ Large, data = x, xlim = c(100,800), ylim = c(100,800),
xlab = "PERF by Large meter", ylab = "PERF by Mini meter")
abline(c(0,1), col = "gray", lwd = 2)
## estimating L1 concordance coefficient
z <- l1ccc(~ Mini + Large, data = x, boots = FALSE)
z
## output:
# Call:
# l1ccc(x = ~ Mini + Large, data = x, boots = FALSE)
#
# L1 coefficients using:
# Laplace Gaussian U-statistic
# 0.7456 0.7607 0.7642
#
# Lin's coefficients:
# estimate accuracy precision
# 0.9395 0.9974 0.9419
Minimum absolute residual (L1) regression
Description
Performs an L1 regression on a matrix of explanatory variables and a vector of responses.
Usage
l1fit(x, y, intercept = TRUE, tolerance = 1e-07, print.it = TRUE)
Arguments
x |
vector or matrix of explanatory variables. Each row corresponds to an
observation and each column to a variable. The number of rows of |
y |
numeric vector containing the response. Missing values are not allowed. |
intercept |
logical flag. If |
tolerance |
numerical value used to test for singularity in the regression. |
print.it |
logical flag. If |
Details
The Barrodale-Roberts algorithm, which is a specialized linear programming algorithm, is used.
Value
list defining the regression (compare with function lsfit
).
coefficients |
vector of coefficients. |
residuals |
residuals from the fit. |
message |
character strings stating whether a non-unique solution is possible,
or if the |
References
Barrodale, I., Roberts, F.D.K. (1973). An improved algorithm for discrete L1 linear approximations. SIAM Journal of Numerical Analysis 10, 839-848.
Barrodale, I., Roberts, F.D.K. (1974). Solution of an overdetermined system of equations in the L1 norm. Communications of the ACM 17, 319-320.
Bloomfield, P., Steiger, W.L. (1983). Least Absolute Deviations: Theory, Applications, and Algorithms. Birkhauser, Boston, Mass.
Examples
l1fit(stack.x, stack.loss)
Least absolute deviations regression
Description
This function is used to fit linear models considering Laplace errors.
Usage
lad(formula, data, subset, na.action, method = "BR", tol = 1e-7, maxiter = 200,
x = FALSE, y = FALSE, contrasts = NULL)
Arguments
formula |
an object of class |
data |
an optional data frame containing the variables in the model. If
not found in |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fit. |
na.action |
a function that indicates what should happen when the data contain NAs. |
method |
character string specifying the fitting method to be used; the options
are |
tol |
the relative tolerance for the iterative algorithm. Default is |
maxiter |
The maximum number of iterations for the |
x , y |
logicals. If |
contrasts |
an optional list. See the |
Value
An object of class lad
representing the linear model fit. Generic
function print
, show the results of the fit.
The functions print
and summary
are used to obtain and print a summary
of the results. The generic accessor functions coefficients
, fitted.values
and residuals
extract various useful features of the value returned by lad
.
Author(s)
The design was inspired by the R function lm
.
References
Barrodale, I., Roberts, F.D.K. (1974). Solution of an overdetermined system of equations in the L1 norm. Communications of the ACM 17, 319-320.
Phillips, R.F. (2002). Least absolute deviations estimation via the EM algorithm. Statistics and Computing 12, 281-285.
Examples
fm <- lad(stack.loss ~ ., data = stackloss, method = "BR")
summary(fm)
data(ereturns)
fm <- lad(m.marietta ~ CRSP, data = ereturns, method = "EM")
summary(fm)
# basic observations
fm$basic
Fitter functions for least absolute deviation (LAD) regression
Description
This function is a switcher among various numerical fitting functions
(lad.fit.BR
, and lad.fit.EM
). The argument method
does the switching: "BR"
for lad.fit.BR
, etc. This should
usually not be used directly unless by experienced users.
Usage
lad.fit(x, y, method = "BR", tol = 1e-7, maxiter = 200)
Arguments
x |
design matrix of dimension |
y |
vector of observations of length |
method |
currently, methods |
tol |
the relative tolerance for the iterative algorithm. Default is |
maxiter |
The maximum number of iterations for the |
Value
a list
with components:
coefficients |
a named vector of coefficients. |
scale |
final scale estimate of the random error. |
residuals |
the residuals, that is response minus fitted values. |
fitted.values |
the fitted values. |
SAD |
the sum of absolute deviations. |
weights |
estimated |
basic |
basic observations, that is observations with zero residuals. |
logLik |
the log-likelihood at convergence. |
See Also
Examples
x <- cbind(1, stack.x)
fm <- lad.fit(x, stack.loss, method = "BR")
fm
Fit a least absolute deviation (LAD) regression model
Description
Fits a linear model using LAD methods, returning the bare minimum computations.
Usage
lad.fit.BR(x, y, tol = 1e-7)
lad.fit.EM(x, y, tol = 1e-7, maxiter = 200)
Arguments
x , y |
numeric vectors or matrices for the predictors and the response in
a linear model. Typically, but not necessarily, |
tol |
the relative tolerance for the iterative algorithm. Default is |
maxiter |
The maximum number of iterations for the |
Value
The bare bones of a lad
object: the coefficients, residuals, fitted values,
and some information used by summary.lad
.
See Also
Examples
x <- cbind(1, stack.x)
z <- lad.fit.BR(x, stack.loss)
z
Multivariate Laplace distribution
Description
These functions provide the density and random number generation from the multivariate Laplace distribution.
Usage
dmLaplace(x, center = rep(0, nrow(Scatter)), Scatter = diag(length(center)), log = FALSE)
rmLaplace(n = 1, center = rep(0, nrow(Scatter)), Scatter = diag(length(center)))
Arguments
x |
vector or matrix of data. |
n |
the number of samples requested. |
center |
a |
Scatter |
a |
log |
logical; if TRUE, the logarithm of the density function is returned. |
Details
The multivariate Laplace distribution is a multidimensional extension of the univariate symmetric Laplace distribution. There are multiple forms of the multivariate Laplace distribution. Here, a particular case of the multivarite power exponential distribution introduced by Gomez et al. (1998) is considered.
The multivariate Laplace distribution with location \bold{\mu}
= center
and \bold{\Sigma}
= Scatter
has density
f(\bold{x}) =
\frac{\Gamma(k/2)}{\pi^{k/2}\Gamma(k)2^{k+1}}|\bold{\Sigma}|^{-1/2}
\exp\left\{-\frac{1}{2}[(\bold{x} - \bold{\mu})^T \bold{\Sigma}^{-1} (\bold{x}
- \bold{\mu})]^{1/2}\right\}.
The function rmLaplace
is an interface to C routines, which make calls to
subroutines from LAPACK. The matrix decomposition is internally done using
the Cholesky decomposition. If Scatter
is not non-negative definite then
there will be a warning message.
Value
If x
is a matrix with n
rows, then dmLaplace
returns a
n\times 1
vector considering each row of x
as a copy from
the multivariate Laplace.
If n = 1
, then rmLaplace
returns a vector of the same length as
center
, otherwise a matrix of n
rows of random vectors.
References
Fang, K.T., Kotz, S., Ng, K.W. (1990). Symmetric Multivariate and Related Distributions. Chapman & Hall, London.
Gomez, E., Gomez-Villegas, M.A., Marin, J.M. (1998). A multivariate generalization of the power exponential family of distributions. Communications in Statistics - Theory and Methods 27, 589-600.
Examples
# dispersion parameters
Scatter <- matrix(c(1,.5,.5,1), ncol = 2)
Scatter
# generate the sample
y <- rmLaplace(n = 2000, Scatter = Scatter)
# scatterplot of a random bivariate Laplace sample with center
# vector zero and scale matrix 'Scatter'
par(pty = "s")
plot(y, xlab = "", ylab = "")
title("bivariate Laplace sample", font.main = 1)
Simulate responses from lad
models
Description
Simulate one or more responses from the distribution corresponding to a
fitted lad
object.
Usage
## S3 method for class 'lad'
simulate(object, nsim = 1, seed = NULL, ...)
Arguments
object |
an object representing a fitted model. |
nsim |
number of response vectors to simulate. Defaults to 1. |
seed |
an object specifying if and how the random number generator should
be initialized (‘seeded’). For the "lad" method, either |
... |
additional optional arguments. |
Value
For the "lad"
method, the result is a data frame with an attribute
"seed"
. If argument seed
is NULL
, the attribute is the
value of .Random.seed
before the simulation was started.
Author(s)
Tymoteusz Wolodzko and Felipe Osorio
Examples
fm <- lad(stack.loss ~ ., data = stackloss)
sm <- simulate(fm, nsim = 4)
Computation of the generalized spatial median
Description
Computation of the generalized spatial median estimator as defined by Rao (1988).
Usage
spatial.median(x, data, subset, na.action, tol = 1e-6, maxiter = 200)
Arguments
x |
a formula or a numeric matrix or an object that can be coerced to a numeric matrix. |
data |
an optional data frame (or similar: see |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fitting process. |
na.action |
a function that indicates what should happen when the data contain NAs. |
tol |
the relative tolerance in the iterative algorithm. |
maxiter |
maximum number of iterations. The default is 200. |
Details
An interesting fact is that the generalized spatial median estimator proposed by Rao (1988) is
the maximum likelihood estimator under the Kotz-type distribution discussed by Naik and Plungpongpun (2006).
The generalized spatial median estimators are defined as \hat{\bold{\mu}}
and \hat{\bold{\Sigma}}
which minimize
\frac{n}{2}\log|\bold{\Sigma}| + \sum\limits_{i=1}^n \sqrt{(\bold{x} - \bold{\mu})^T
\bold{\Sigma}^{-1} (\bold{x} - \bold{\mu})},
simultaneously with respect to \bold{\mu}
and \bold{\Sigma}
.
The function spatial.median
follows the iterative reweighting algorithm of Naik and Plungpongpun (2006).
Value
A list with class 'spatial.median'
containing the following components:
call |
a list containing an image of the |
median |
final estimate of the location vector. |
Scatter |
final estimate of the scale matrix. |
logLik |
the log-likelihood at convergence. |
numIter |
the number of iterations used in the iterative algorithm. |
innerIter |
the total number of iterations used in the inner iterative algorithm. |
weights |
estimated weights corresponding to the Kotz distribution. |
distances |
estimated squared Mahalanobis distances. |
Generic function print
show the results of the fit.
References
Naik, D.N., Plungpongpun, K. (2006). A Kotz-type distribution for multivariate statistical inference. In: Balakrishnan, N., Sarabia, J.M., Castillo, E. (Eds) Advances in Distribution Theory, Order Statistics, and Inference. Birkhauser Boston, pp. 111-124.
Rao, C.R. (1988). Methodology based on the L1-norm in statistical inference. Sankhya, Series A 50, 289-313.
See Also
Examples
z <- spatial.median(stack.x)
z
Calculate variance-covariance matrix from lad
models
Description
Returns the variance-covariance matrix of the main parameters of a fitted model
for lad
objects. The “main” parameters of model correspond to those
returned by coef
, and typically do not contain the nuisance scale
parameter.
Usage
## S3 method for class 'lad'
vcov(object, ...)
Arguments
object |
an object representing a fitted model. |
... |
additional arguments for method functions. |
Value
A matrix of the estimated covariances between the parameter estimates in the
linear regression model. This should have row and column names corresponding
to the parameter names given by the coef
method.