Help for package FKSUM

Type:

Package

Title:

Fast Kernel Sums

Version:

1.0.1

Author:

David P. Hofmeyr

Maintainer:

David P. Hofmeyr <dhofmeyr@sun.ac.za>

Description:

Implements the method of Hofmeyr, D.P. (2021) <doi:10.1109/TPAMI.2019.2930501> for fast evaluation of univariate kernel smoothers based on recursive computations. Applications to the basic problems of density and regression function estimation are provided, as well as some projection pursuit methods for which the objective is based on non-parametric functionals of the projected density, or conditional density of a response given projected covariates. The package is accompanied by an instructive paper in the Journal of Statistical Software <doi:10.18637/jss.v101.i03>.

License:

GPL-2 | GPL-3 [expanded from: GPL]

Encoding:

UTF-8

Depends:

Rcpp (≥ 0.12.16)

Imports:

rARPACK, MASS, Matrix

LinkingTo:

Rcpp, RcppArmadillo

NeedsCompilation:

yes

Packaged:

2023-04-15 16:01:51 UTC; david

Repository:

CRAN

Date/Publication:

2023-04-15 16:20:02 UTC

Fast Exact Kernel Smoothing

Description

Uses recursive expressions to compute exact univariate kernel smoothers in log-linear time based on the method described by Hofmeyr (2021). The main general purpose function is fk_sum() which computes exact (or binned approximations of) weighted sums of kernels, or their derivatives. Standard smoothing problems such as density estimation and regression can be addressed directly using this function, or using the purpose-built functions fk_density() and fk_regression. Projection pursuit algorithms based on minimum entropy (ICA, Hyvarinen and Oja 2000), minimum density cluster separation (MDH, Pavlidis et al. 2016) and regression-type losses (PPR, Friedman and Stuetzle 1981) are implemented in the functions fk_ICA(), fk_ppr() and fk_mdh() respectively. The package is accompanied by an instructive paper in the Journal of Statistical Software (see below for reference).

Details

Package: FKSUM

Type: Package

Title: Fast Kernel Sums

Version: 0.1.3

Depends: Rcpp (>= 0.12.16)

License: GPL-3

LazyData: yes

Imports: rARPACK, MASS

LinkingTo: Rcpp, RcppArmadillo

Author(s)

David Hofmeyr[aut, cre]

Maintainer: David P. Hofmeyr <dhofmeyr@sun.ac.za>

References

Hofmeyr, D.P. (2022) "Fast kernel smoothing in R with applications to projection pursuit", Journal of Statistical Software, 101(3), 1-33.

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Hyvarinen, A., and Oja, E. (2000) "Independent component analysis: algorithms and applications." Neural networks, 13(4), 411–430.

Friedman, J.H., and Stuetzle, W. (1981) "Projection pursuit regression." Journal of the American statistical Association, 76(376), 817–823.

Pavlidis, N.G., Hofmeyr, D.P., and Tasoulis, S.K. (2016) "Minimum density hyperplanes." Journal of Machine Learning Research, 17(156), 1–33.

Compute discrete bin weights

Description

Computes total weight from an initial weight vector falling into regular bins, based on corresponding location of sample points. Not intended for general use.

Allocation of points to bins

Description

Finds the bin into which each of a set of sample points falls. Not intended for alternative use.

Gradient of projection index for independent component analysis.

Description

Computes the gradient of the objective evaluating sample entropy of a data set projected onto a given vector. Only used within optimisation of projection. Not intended for alternative use.

Gradient of the projection index for projection pursuit regression

Description

Computes the gradient of the loss of a smoother fit to projected covariates and corresponding response values. Only used within optimisation of projection. Not intended for alternative use.

Kernel derivative sums

Description

Fast computation of kernel derivative sums. Called by multiple functions exported by the package, but not intended for alternative use.

Projection index for independent component analysis.

Description

Computes the sample entropy of a data set projected onto a given vector. Only used within optimisation of projection. Not intended for alternative use.

Projection index for projection pursuit regression

Description

Computes the loss of a smoother fit to projected covariates and corresponding response values. Only used within optimisation of projection. Not intended for alternative use.

Initialisation for PPR based on Ridge LM after GAM type smoothing

Description

Computes an initial projection for PPR by replacing each covariate by the smoothed response against it. Then fit a standard ridge estimator to the transformed data. Not intended for general use.

Independent component analysis with sample entropy estimated via kernel density

Description

Performs minimum entropy projection pursuit

Usage

fk_ICA(X, ncomp = 1, beta = c(.25, .25), hmult = 1.5, it = 20, nbin = NULL)

Arguments

X

numeric data matrix (num_data x num_dimensions).

ncomp

integer number of indpenedent components to extract.

beta

numeric vector of kernel coefficients. See Hofmeyr (2019) for details. The default is the smooth order one kernel described in the paper.

hmult

positive numeric. The bandwidth in the kernel density is set to hmult multiplied by Silverman's rule of thumb value, which is based on the AMISE minimiser when the underlying distribution is Gaussian.

it

integer maximum number of iterations. The default is 20.

nbin

integer number of bins if binning estimator is to be used. The default is to compute the exact entropy estimate from the kde.

Value

A named list with fields

$X

the data matrix given as argument.

$K

the pre-whitening matrix.

$W

the estimated un-mixing matrix.

$S

the estimated source matrix, S = sweep(X,2,colMeans(X),'-')%*%K%*%W.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

op <- par(no.readonly = TRUE)

set.seed(1)

### Generate a set of data with 5 non-Gaussian components
### and 20 noise (Gaussian) components. The non-Gaussian
### components are cubed Gaussian, exponential, uniform,
### t3 and bimodal Gaussian mixture.

errdim = 10
ndat = 2000
X = cbind(rnorm(ndat)^3/5, rexp(ndat), runif(ndat)*sqrt(12)*2, rt(ndat, 3),
    c(rnorm(ndat/2), rexp(ndat/2)+2), matrix(rnorm(errdim*ndat), ncol = errdim))

### Generate a random mixing matrix and mixed data XX

R = matrix(rnorm((5+errdim)*(5+errdim)), ncol = 5+errdim)
XX = X%*%R

### Apply fk_ICA to XX to extract components.
### Note that ordering of extracted components is
### in some sense arbitrary w.r.t. their generation

model <- fk_ICA(XX, ncol(XX))

par(mfrow = c(1, 5))

for(i in 1:5) plot(density(model$S[,i]))

par(op)

Nadaraya-Watson regression estimator

Description

Called by the function fk_regression(). Not intended for alternative use.

Fast univariate kernel density estimation

Description

Uses recursive formulation for kernel sums as described in Hofmeyr (2021) to evaluate kernel estimate of the density exactly. Binning approximation also available for faster computation if needed. Default is exact evaluation on a grid, but evaluation at an arbitrary collection of points is possible.

Usage

fk_density(x, h = 'Silverman', h_adjust = 1, beta = NULL, from = NULL,
              to = NULL, ngrid = 1000, nbin = NULL, x_eval = NULL)

Arguments

x

vector of sample points.

h

(optional) bandwidth to be used in estimate. Can be either positive numeric, or one of "Silverman" for Silverman's rule of thumb (Silverman, 1986) or "mlcv" for maximum pseudo-likelihood cross validation bandwidth. Default is Silverman's heuristic. Exact "mlcv" will be time consuming for samples of more than millions of points. Binning approximation for "mlcv" bandwidth needs at least 10000 bins for reasonable accuracy, and even more if density has very sharp features.

h_adjust

(optional) positive numeric. Final bandwidth will be h*h_adjust. Default value is 1. R's base function density uses Silverman's heuristic with h_adjust approximately 0.85.

beta

(optional) numeric vector of kernel coefficients. See Hofmeyr (2019) for details. The default is the smooth order one kernel described in the paper.

from

(optional) lower end of evaluation interval if evaluation on a grid is desired. Default is min(x)-6*h

to

(optional) upper end of evaluation interval if evaluation on a grid is desired. Default is max(x)+6*h

ngrid

(optional) integer number of grid points for evaluation. Default is 1000.

nbin

(optional) integer number of bins if binning estimator is to be used. The default is to compute the exact density on a grid of 1000 points.

x_eval

(optional) vector of evaluation points. The default if both ngrid and nbin are set to NULL is evaluation at the sample points themselves. If another specific set of points is required then ngrid must be set to null and x_eval supplied. Evaluation at arbitrary x_eval using binned approximation is also possible, in which case nbin and x_eval must both be supplied.

Value

A named list with fields

$x

the vector of points at which the density is estimated.

$y

the estimated density values.

$h

the value of the bandwidth.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Silverman, B. (1986) Density estimation for statistics and data analysis, volume 26. CRC press.

Examples

op <- par(no.readonly = TRUE)

set.seed(1)

### generate a bimodal Gaussian mixture sample with 100000 points

n1 <- rbinom(1, 100000, .5)

x <- c(rnorm(n1), rnorm(100000-n1)/4+2)



# --------- Example 1: Grid evaluation --------------#
### evaluate exact and binned approximation on a
### grid and plot along with true density. We can
### use both Silverman's heuristic and the "mlcv" estimate.

xs <- seq(-5, 3.5, length = 1000)

ftrue <- dnorm(xs)/2 + dnorm(xs, 2, 1/4)/2

fhat <- fk_density(x, from = -5, to = 3.5)

fhat_bin <- fk_density(x, nbin = 1000, from = -5, to = 3.5)

par(mfrow = c(2, 2))
plot(xs, ftrue, type = 'l', lwd = 4, col = rgb(.7, .7, .7),
    xlab = 'x', ylab = 'f(x)',
    main = 'Exact evaluation, Silverman bandwidth')
lines(fhat, lty = 2)

plot(xs, ftrue, type = 'l', lwd = 4, col = rgb(.7, .7, .7),
    xlab = 'x', ylab = 'f(x)',
    main = 'Binned approximation, Silverman bandwidth')
lines(fhat_bin, lty = 2)




fhat <- fk_density(x, from = -5, to = 3.5, h = 'mlcv')

fhat_bin <- fk_density(x, nbin = 1000, from = -5,
    to = 3.5, h = 'mlcv')

plot(xs, ftrue, type = 'l', lwd = 4, col = rgb(.7, .7, .7),
    xlab = 'x', ylab = 'f(x)',
    main = 'Exact evaluation, MLCV bandwidth')
lines(fhat, lty = 2)

plot(xs, ftrue, type = 'l', lwd = 4, col = rgb(.7, .7, .7),
    xlab = 'x', ylab = 'f(x)',
    main = 'Binned approximation, MLCV bandwidth')
lines(fhat_bin, lty = 2)


par(op)


# --------- Example 2: Evaluation at sample --------------#
### evaluate exact and binned approximation at the sample.
### Note that the output will be in the order of the original
### sample, and also the number of points will be large. It is
### not advisable, therefore, to simply plot these. We instead
### compute the mean squared deviations from the true density

ftrue <- sapply(x, function(xi) dnorm(xi)/2 + dnorm(xi, 2, 1/4)/2)

fhat <- fk_density(x, ngrid = NULL)

fhat_bin <- fk_density(x, nbin = 1000, x_eval = x)


mean((ftrue-fhat$y)^2)

mean((ftrue-fhat_bin$y)^2)


### now for MLCV bandwidth

fhat <- fk_density(x, h = 'mlcv', ngrid = NULL)

fhat_bin <- fk_density(x, nbin = 1000,
    h = 'mlcv', x_eval = x)


mean((ftrue-fhat$y)^2)

mean((ftrue-fhat_bin$y)^2)

Gradient of projection index for finding minimum density hyperplanes

Description

Computes the gradient of the objective based on minimum integrated density of a hyperplane orthogonal to a given projection vector. Only used within optimisation of projection. Not intended for alternative use.

Projection index for finding minimum density hyperplanes

Description

Computes the minimum integrated density orthogonal to a given projection vector. Only used within optimisation of projection. Not intended for alternative use.

Check if MDH constraints are active

Description

Called during processing of fk_mdh(). Not intended for alternative use.

Local linear regression estimator

Description

Called by the function fk_regression(). Not intended for alternative use.

C++ code for evaluating mimimum density hyperplane from projected data

Description

Called during processing of fk_mdh(). Not intended for alternative use.

Minimum density hyperplane orthogonal to a vector

Description

Finds the location of the minimum density hyperplane orthogonal to a given vector. Not intended for alternative use.

C++ code for evaluating partial gradient of mimimum density hyperplane w.r.t. projected data

Description

Called during processing of fk_mdh(). Not intended for alternative use.

Minimum density hyperplanes

Description

Estimates minimum density hyperplanes for clustering using projection pursuit, based on the method of Pavlidis et al. (2016)

Usage

fk_mdh(X, v0 = NULL, hmult = 1, beta = c(.25,.25), alphamax = 1)

Arguments

X

numeric data matrix (num_data x num_dimensions).

v0

(optional) vector representing the initial projection direction. Default is the first principal direction.

hmult

(optional) positive numeric. The bandwidth in the kernel density is set to hmult multiplied by Silverman's rule of thumb value, which is based on the AMISE minimiser when the underlying distribution is Gaussian.

beta

(optional) numeric vector of kernel coefficients. The default is the smooth order one kernel described by Hofmeyr (2019).

alphamax

(optional) maximum/final (scaled) distance of the optimal hyperplane from the mean of the data. The default is alphamax = 1.

Value

A named list with fields

$v

the optimal projection vector.

$b

the location of the minimum density hyperplane orthogonal to v.

References

Pavlidis N.G., Hofmeyr D.P., Tasoulis S.K. (2016) "Minimum Density Hyperplanes", Journal of Machine Learning Research, 17(156), 1–33.

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

op <- par(no.readonly = TRUE)

set.seed(1)

### Generate data from a simple 10 component mixture model in 10 dimensions:
### Determine means of components

mu <- matrix(runif(100), ncol = 10)

### Determine scales of components (diagonal elements of covariance)

covs <- matrix(rexp(100), ncol = 10)/10

### Determine cluster indicator matrix

I <- t(rmultinom(2000, 1, 1:10))

### Determine mean and residual matrix

M <- I%*%mu

R <- matrix(rnorm(20000), 2000, 10)*(I%*%covs)

### Data is given by the sum of these

X <- M + R

### Find the minimum density hyperplane separator and plot
### the projected data as well as the PCA plot for comparison

mdh <- fk_mdh(X)

par(mfrow = c(2, 2))

plot(X%*%mdh$v, X%*%eigen(cov(X))$vectors[,2], xlab = 'mdh vector',
    ylab = '2nd principal component', main = 'MDH solution')
abline(v = mdh$b, col = 2)
plot(X%*%eigen(cov(X))$vectors[,1:2], xlab = '1st principal component',
    ylab = '2nd principal component', main = 'PCA plot')

plot(fk_density(X%*%mdh$v))
abline(v = mdh$b, col = 2)
plot(fk_density(X%*%eigen(cov(X))$vectors[,1]))

par(op)

Projection pursuit regression with local linear kernel smoother

Description

Generates a projection pursuit regression model for covariates X and response y

Usage

fk_ppr(X, y, nterms=1, hmult=1, betas=NULL, loss=NULL,
    dloss=NULL, initialisation="lm", type = "loc-lin")

Arguments

X

a numeric matrix (num_data x num_dimensions) of covariates.

y

a numeric vector of responses.

nterms

The number of terms in the additive regression model. The default is a single term.

betas

The coefficients in the expression of the kernel. See Hofmeyr (2019) for details. The default is the smooth order one kernel described in the paper.

hmult

The bandwidth in the kernel smoother is set to eigen(cov(X))values[1]^.5/nrow(X)^.2*hmult during projection pursuit. The final value is based on leave-one-out cross validation on the optimal projection.

loss

The (additive) loss function to be used. Allows for generalised responses by specifying an appropriate likelihood/deviance function for the loss. Note: loss is to be minimised, so deviance or negative log-likelihood would be appropriate. If specified then must be a function of y and hy (the fitted values, yhat), returning a vector of "errors"" which are added as the total loss. The default is the squared error.

dloss

The derivative of the loss function. Also takes arguments y and hy, and returns the vector of partial derivatives of the loss w.r.t. hy.

initialisation

Method use to initialise the projection vectors. Must be one of "lm" and "random", or a function taking only arguments X and y, and returning a vector of length ncol(X). The default is "lm", which is a simple linear model with a small ridge to ensure a solution. "random" uses random initialisation.

type

The type of regression estimator. Must be one of "loc-lin" for local linear regression, or "NW" for the Nadaray Watson, or local constant regression. The default is local linear regression.

Value

A named list with class fk_ppr containing

$mu

The estimated (global) mean of the response.

$mu_X

The vector of estimated means of the covariates.

$y

The responses given as argument to the function.

$X

The covariates given as argument to the function.

$hs

A vector of bandwidths used for each term in the model.

$vs

A matrix whose rows are the projection vectors.

$fitted

The fitted values on each projection. Note that these are based on the residuals used to fit that component of the model, and not the original y values. $fitted is used for prediction on test data.

$beta

The beta coefficients in the kernel formulation.

References

Friedman, J., and Stuetzle, W. (1981) "Projection pursuit regression." Journal of the American statistical Association 76.376.

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

op <- par(no.readonly = TRUE)

set.seed(2)

### Generate a set of data

X = matrix(rnorm(10000), ncol = 10)

### Generate some "true" projection vectors

beta1 = (runif(10)>.5)*rnorm(10)
beta2 = (runif(10)>.5)*rnorm(10)

### Generate responses, dependent on X%*%beta1 and X%*%beta2

y = 1 + X%*%beta1 + ((X%*%beta2)>2)*(X%*%beta2-2)*10
y = y + (X%*%beta1)*(X%*%beta2)/5 + rnorm(1000)

### Fit a PPR model with 2 terms on a sample of the data

train_ids = sample(1:1000, 500)

model = fk_ppr(X[train_ids,], y[train_ids], nterms = 2)

### Predict on left out data, and compute
### estimated coefficient of determination

yhat = predict(model, X[-train_ids,])

MSE = mean((yhat-y[-train_ids])^2)
Var = mean((y[-train_ids]-mean(y[-train_ids]))^2)

1-MSE/Var


#################################################

### Add some "outliers" in the training data and fit
### the model again, as well as one with an absolute loss

y[train_ids] = y[train_ids] + (runif(500)<.05)*(rnorm(500)*100)

model1 <- fk_ppr(X[train_ids,], y[train_ids], nterms = 2)

model2 <- fk_ppr(X[train_ids,], y[train_ids], nterms = 2,
    loss = function(y, hy) abs(y-hy),
    dloss = function(y, hy) sign(hy-y))

### Plot the resulting components in the model on the test data

par(mar = c(2, 2, 2, 2))
par(mfrow = c(2, 2))

plot(X[-train_ids,]%*%model1$vs[1,], y[-train_ids])
plot(X[-train_ids,]%*%model1$vs[2,], y[-train_ids])
plot(X[-train_ids,]%*%model2$vs[1,], y[-train_ids])
plot(X[-train_ids,]%*%model2$vs[2,], y[-train_ids])

par(op)

### Estimate comparative estimated coefficients of determination

MSE1 = mean((predict(model1, X[-train_ids,])-y[-train_ids])^2)
MSE2 = mean((predict(model2, X[-train_ids,])-y[-train_ids])^2)
Var = mean((y[-train_ids]-mean(y[-train_ids]))^2)


1-MSE1/Var
1-MSE2/Var

Fast univariate kernel regression

Description

Uses recursive formulation for kernel sums as described in Hofmeyr (2021) for exact evaluation of kernel regression estimate. Both local linear and Nadaraya-Watson (Watson, 1964; Nadaraya, 1964) estimators are available. Binning approximation also available for faster computation if needed. Default is exact evaluation on a grid, but evaluation at sample points is also possible.

Usage

fk_regression(x, y, h = 'amise', beta = NULL, from = NULL,
    to = NULL, ngrid = 1000, nbin = NULL, type = 'loc-lin')

Arguments

x

vector of covariates.

y

vector of response values.

h

(optional) bandwidth to be used in estimate. Can be either positive numeric, or one of "amise" for a rough estimate of the asymptotic mean integrated square error minimiser, or "cv" for leave-one-out cross validation error minimiser based on squared error. Default is "amise". Cross validation not available for binning approximation.

beta

(optional) numeric vector of kernel coefficients. See Hofmeyr (2019) for details. The default is the smooth order one kernel described in the paper.

from

(optional) lower end of evaluation interval if evaluation on a grid is desired. Default is min(x)

to

(optional) upper end of evaluation interval if evaluation on a grid is desired. Default is max(x)

ngrid

(optional) integer number of grid points for evaluation. Default is 1000.

nbin

(optional) integer number of bins if binning estimator is to be used. The default is to compute the exact density on a grid of 1000 points.

type

(optional) one of "loc-lin" and "NW" if local linear or Nadaraya-Watson is desired, respectively. Default is local linear estimator.

Value

A named list with fields

$x

the vector of points at which the regression function is estimated.

$y

the estimated function values.

$h

the value of the bandwidth.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Nadaraya, E.A. (1964) "On estimating regression." Theory of Probability and Its Applications,9(1), 141–142.

Watson, G.S. (1964) "Smooth regression analysis." Sankhya: The Indian Journal of Statistics, Series A, pp. 359–372.

Examples

op <- par(no.readonly = TRUE)

set.seed(1)

n <- 2000
x <- rbeta(n, 2, 2) * 10
y <- 3 * sin(2 * x) + 10 * (x > 5) * (x - 5)
y <- y + rnorm(n) + (rgamma(n, 2, 2) - 1) * (abs(x - 5) + 3)

xs <- seq(min(x), max(x), length = 1000)
ftrue <- 3 * sin(2 * xs) + 10 * (xs > 5) * (xs - 5)

fhat_loc_lin <- fk_regression(x, y)
fhat_NW <- fk_regression(x, y, type = 'NW')

par(mfrow = c(2, 2))
plot(fhat_loc_lin, main = 'Local linear estimator with amise bandwidth')

plot(fhat_NW, main = 'NW estimator with amise bandwidth')


fhat_loc_lin <- fk_regression(x, y, h = 'cv')
fhat_NW <- fk_regression(x, y, type = 'NW', h = 'cv')

plot(fhat_loc_lin, main = 'Local linear estimator with cv bandwidth')

plot(fhat_NW, main = 'NW estimator with cv bandwidth')

par(op)

Fast Exact Kernel Sum Evaluation

Description

Computes exact (and binned approximations of) kernel and kernel derivative sums with arbitrary weights/coefficients. Computation is based on the method of Hofmeyr (2021).

Usage

fk_sum(x, omega, h, x_eval = NULL, beta = c(.25,.25),
    nbin = NULL, type = "ksum")

Arguments

x

numeric vector of sample points.

omega

numeric vector of weights.

h

numeric bandwidth (must be strictly positive).

x_eval

vector of evaluation points. Default is to evaluate at the sample points themselves.

beta

numeric vector of kernel coefficients. Default is c(.25, .25); the smooth order 1 kernel.

nbin

integer number of bins for binning approximation. Default is to compute the exact sums.

type

one of "ksum": returns the kernel sums, "dksum": returns the kernel derivative sums and "both": returns a matrix cbind(ksum, dksum).

Value

A vector (if type%in%c("ksum","dksum")) of kernel sums, or kernel derivative sums. A matrix (if type == "both") with kernel sums in its first column and kernel derivative sums in its second column.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

### Compute density estimates directly with
### kernel sums and constant normalising weights

set.seed(1)
n <- 150000
num_Gauss <- rbinom(1, n, 2 / 3)
x <- c(rnorm(num_Gauss), rexp(n - num_Gauss) + 1)

hs <- seq(.025, .1, length = 5)
xeval <- seq(-4, 8, length = 1000)
ftrue <- 2 / 3 * dnorm(xeval) + 1 / 3 * dexp(xeval - 1)
plot(xeval, ftrue, lwd = 6, col = rgb(.8, .8, .8), xlab = "x",
  ylab = "f(x)", type = "l")

for(i in 1:5) lines(xeval, fk_sum(x, rep(1 / hs[i] / n, n), hs[i],
    x_eval = xeval), lty = i)

Bandwidth conversion from Gaussian

Description

A naive and simple, but useful way of transforming a bandwidth for use with the Gaussian kernel to an appropriate value to be used with a kernel implemented in the FKSUM package. The transformation if based on the relative values for the kernels for the AMISE minimal bandwidth for the purpose of density estimation. The transformation has been useful for other problems as well.

Usage

h_Gauss_to_K(h, beta)

Arguments

h

positive numeric bandwidth value appropriate for use with the Gaussian kernel.

beta

numeric vector of kernel coefficients.

Value

positive numeric value, the bandwidth to be used for estimation using the kernel with coefficients beta.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

### Use existing bandwidth selection with the package

### generate sample from bimodal Gaussian mixture with varying scale

x <- c(rnorm(100000), rnorm(100000)/4 + 2)

### estimate bandwidth using bw.SJ

bwsj <- h_Gauss_to_K(bw.SJ(x), c(.25,.25))

xs <- seq(-3, 4, length = 1000)
plot(xs, dnorm(xs)/2+dnorm(xs,2,1/4)/2, type = 'l',
    col = rgb(.7, .7, .7), lwd = 4)

lines(fk_density(x, h = bwsj), lty = 2, lwd = 2)

Bandwidth conversion to Gaussian

Description

A naive and simple, but useful way of transforming a bandwidth for use with a kernel implemented in FKSUM to an appropriate value to be used with the Gaussian kernel. The transformation if based on the relative values for the kernels for the AMISE minimal bandwidth for the purpose of density estimation. The transformation has been useful for other problems as well.

Usage

h_K_to_Gauss(h, beta)

Arguments

h

positive numeric bandwidth value appropriate for use with the FKSUM kernel with coefficients beta to an appropriate value for use with the Gaussian kernel.

beta

numeric vector of kernel coefficients.

Value

positive numeric value, the bandwidth to be used for estimation using the Gaussian kernel.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

### Use the package for data driven bandwidth for use with
### the Gaussian kernel in existing implementations

### generate sample from bimodal Gaussian mixture with varying scale

x <- c(rnorm(100000), rnorm(100000)/4 + 2)

### estimate bandwidth with the package using MLCV and convert

bwml <- h_K_to_Gauss(fk_density(x, h = 'mlcv',
    beta = c(.25,.25))$h, c(.25,.25))

bwml_binned <- h_K_to_Gauss(fk_density(x, h = 'mlcv',
    beta = c(.25,.25), nbin = 10000)$h, c(.25,.25))

xs <- seq(-3, 4, length = 1000)
plot(xs, dnorm(xs)/2+dnorm(xs,2,1/4)/2, type = 'l',
    lwd = 4, col = rgb(.7, .7, .7))

lines(density(x, bw = bwml), lty = 2, lwd = 2)

lines(density(x, bw = bwml_binned), lty = 3, lwd = 2)

Leave-one-out regression smoother

Description

Called during processing of fk_ppr(). Not intended for alternative use.

Kernel and kernel derivative sums

Description

Fast computation of both kernel sums and kernel derivative sums. Called by multiple functions exported by the package, but not intended for alternative use.

Kernel sums

Description

Fast computation of kernel sums. Called by multiple functions exported by the package, but not intended for alternative use.

The L2 norm of a kernel

Description

Computes the L2 norm of a kernel implemented in FKSUM, based on its coefficients. NB: the coefficients will first be normalised so that the kernel represents a density function. Equivalent to sqrt(roughness_K())

Usage

norm_K(beta)

Arguments

beta

numeric vector of positive coefficients.

Value

positive numeric representing the L2 norm of the kernel with coefficients beta/norm_const(beta).

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Normalising constant for kernels in FKSUM

Description

Computes the normalising constant for kernels implemented in the FKSUM package, to convert kernel with arbitrary beta coefficients to one which is a probability density. That is the kernel with coefficients equal to beta/norm_const_beta(beta) has unit integral

Usage

norm_const_K(beta)

Arguments

beta

numeric vector of positive coefficients.

Value

positive numeric normalising constant

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Plot method for class fk_ICA

Description

Plot method for Independent Component Analysis model obtained with function fk_ICA()

Usage

## S3 method for class 'fk_ICA'
plot(x, ...)

Arguments

x

an object of class fk_ICA, output from the function of the same name.

...

(optional) further plotting parameters.

Plot method for class fk_density

Description

Plot method for kernel density estimate obtained with function fk_density()

Usage

## S3 method for class 'fk_density'
plot(x, main = NULL, ...)

Arguments

x

an object of class fk_density, output from the function of the same name.

main

(optional) main title for plot.

...

(optional) further plotting parameters.

Plot method for class fk_mdh

Description

Plot method for minimum density hyperplane solution obtained with function fk_mdh()

Usage

## S3 method for class 'fk_mdh'
plot(x, ...)

Arguments

x

an object of class fk_mdh, output from the function of the same name.

...

(optional) further plotting parameters.

Plot method for class fk_ppr

Description

Plot method for projection pursuit regression model fit with function fk_ppr()

Usage

## S3 method for class 'fk_ppr'
plot(x, term = NULL, ...)

Arguments

x

an object of class fk_ppr, output from the function of the same name.

term

(optional) the term in the additive regression model to be visualised. If omitted then the residuals and fitted values for the full model are shown.

...

(optional) further arguments passed to or from other methods.

Plot method for class fk_regression

Description

Plot method for kernel regression estimate obtained with function fk_regression()

Usage

## S3 method for class 'fk_regression'
plot(x, main = NULL, ...)

Arguments

x

an object of class fk_regression, output from the function of the same name.

main

(optional) main title for plot.

...

(optional) further plotting parameters.

Plot the shape of a kernel function implemented in FKSUM based on its vector of beta coefficients

Description

Plots the kernel function for a given set of beta coefficients. NB: coefficients will be normalised so that the kernel describes a probability density.

Usage

plot_kernel(beta, type = 'l', ...)

Arguments

beta

positive numeric vector of kernel coefficients.

type

(optional) plot type, as in base plot() function. The default is a line plot.

...

(optional) any additional arguments accepted by base plot() function.

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Examples

### Plot order 4 smooth kernel

plot_kernel(1/factorial(0:4))

### Use a different line style

plot_kernel(1/factorial(0:4), lty = 2)

Predict method for class fk_ppr

Description

Standard prediction method for regression models, specific to outputs from the fk_ppr() function. See help(fk_ppr) for more details on the model.

Usage

## S3 method for class 'fk_ppr'
predict(object, Xtest = NULL, ...)

Arguments

object

an object of class fk_ppr, output from the function of the same name.

Xtest

(optional) matrix of test data on which predictions are to be made. If omitted then fitted values from training data are returned.

...

(optional) further arguments passed to or from other methods.

Value

A vector of predictions for Xtest.

Examples

op <- par(no.readonly = TRUE)

set.seed(2)

### Generate a set of data

X = matrix(rnorm(10000), ncol = 10)

### Generate some "true" projection vectors

beta1 = (runif(10)>.5)*rnorm(10)
beta2 = (runif(10)>.5)*rnorm(10)

### Generate responses, dependent on X%*%beta1 and X%*%beta2

y = 1 + X%*%beta1 + ((X%*%beta2)>2)*(X%*%beta2-2)*10
y = y + (X%*%beta1)*(X%*%beta2)/5 + rnorm(1000)

### Fit a PPR model with 2 terms on a sample of the data

train_ids = sample(1:1000, 500)

model = fk_ppr(X[train_ids,], y[train_ids], nterms = 2)

### Predict on left out data, and compute
### estimated coefficient of determination

yhat = predict(model, X[-train_ids,])

MSE = mean((yhat-y[-train_ids])^2)
Var = mean((y[-train_ids]-mean(y[-train_ids]))^2)

1-MSE/Var


#################################################

### Add some "outliers" in the training data and fit
### the model again, as well as one with an absolute loss

y[train_ids] = y[train_ids] + (runif(500)<.05)*(rnorm(500)*100)

model1 <- fk_ppr(X[train_ids,], y[train_ids], nterms = 2)

model2 <- fk_ppr(X[train_ids,], y[train_ids], nterms = 2,
    loss = function(y, hy) abs(y-hy),
    dloss = function(y, hy) sign(hy-y))

### Plot the resulting components in the model on the test data

par(mar = c(2, 2, 2, 2))
par(mfrow = c(2, 2))

plot(X[-train_ids,]%*%model1$vs[1,], y[-train_ids])
plot(X[-train_ids,]%*%model1$vs[2,], y[-train_ids])
plot(X[-train_ids,]%*%model2$vs[1,], y[-train_ids])
plot(X[-train_ids,]%*%model2$vs[2,], y[-train_ids])

par(op)

### estimate comparative estimated coefficients of determination

MSE1 = mean((predict(model1, X[-train_ids,])-y[-train_ids])^2)
MSE2 = mean((predict(model2, X[-train_ids,])-y[-train_ids])^2)
Var = mean((y[-train_ids]-mean(y[-train_ids]))^2)


1-MSE1/Var
1-MSE2/Var

Predict method for class fk_regression

Description

Predict method for kernel regression. Evaluates the fitted regression function at a set of evaluation/test points

Usage

## S3 method for class 'fk_regression'
predict(object, xtest = NULL, ...)

Arguments

object

an object of class fk_regression, output from the function of the same name.

xtest

(optional) vector of evaluation/test points. If omitted then the fitted values at the original sample are returned.

...

(optional) further arguments passed to or from other methods.

Value

vector of fitted/predicted values of the regression function.

Print method for class fk_ICA

Description

Print method for Independent Component Analysis model fit with function fk_ICA()

Usage

## S3 method for class 'fk_ICA'
print(x, ...)

Arguments

x

an object of class fk_ICA, output from the function of the same name.

...

(optional) further arguments passed to or from other methods.

Print method for class fk_density

Description

Print method for kernel density estimate obtained with function fk_density()

Usage

## S3 method for class 'fk_density'
print(x, ...)

Arguments

x

an object of class fk_density, output from the function of the same name.

...

(optional) further arguments passed to or from other methods.

Print method for class fk_mdh

Description

Print method for minimum density hyperplane solution obtained with function fk_mdh()

Usage

## S3 method for class 'fk_mdh'
print(x, ...)

Arguments

x

an object of class fk_mdh, output from the function of the same name.

...

(optional) further arguments passed to or from other methods.

Print method for class fk_ppr

Description

Print method for projection pursuit regression model fit with function fk_ppr()

Usage

## S3 method for class 'fk_ppr'
print(x, ...)

Arguments

x

an object of class fk_ppr, output from the function of the same name.

...

(optional) further arguments passed to or from other methods.

Print method for class fk_regression

Description

Print method for kernel regression estimate obtained with function fk_regression()

Usage

## S3 method for class 'fk_regression'
print(x, ...)

Arguments

x

an object of class fk_regression, output from the function of the same name.

...

(optional) further arguments passed to or from other methods.

Kernel roughness

Description

Computes the squared L2 norm, also known as the roughness, of a kernel implemented in FKSUM based on its beta coefficients. NB: the coefficients will first be normalised so that the kernel represents a density function. Equivalent to norm_K()^2

Usage

roughness_K(beta)

Arguments

beta

numeric vector of positive coefficients.

Value

positive numeric representing the squared L2 norm, or roughness of the kernel with coefficients beta/norm_const(beta).

References

Hofmeyr, D.P. (2021) "Fast exact evaluation of univariate kernel sums", IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2), 447-458.

Compute smoothed bin weights

Description

Computes total weight from an initial weight vector falling into regular bins, based on corresponding location of sample points. Individual points are shared between pairs of adjacent bins to produce slightly smoother estimation. Not intended for general use.

Variance of a kernel

Description

Computes the variance of the random variable whose density is given by a kernel implemented in FKSUM, with coefficients beta. NB: coefficients will first be normalised so that the kernel is a density function.

Usage

var_K(beta)

Arguments

beta

positive numeric vector of kernel coefficients.

Value

A positive numeric value representing the variance of the random variable with density given by the kernel.

Whitening (standardising) a data matrix

Description

Transforms data matrix by centering, projecting onto its first k principal components, and standardising the resulting components. The output is a zero mean, identity covariance data matrix. Only used for pre-processing prior to ICA. Not intended for alternative access.