Help for package CEGO

Version:

2.4.4

Title:

Combinatorial Efficient Global Optimization

Description:

Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.

License:

GPL (≥ 3)

Type:

Package

LazyLoad:

yes

Depends:

R (≥ 3.0.0)

Imports:

MASS, stats, DEoptim, graphics, quadprog, Matrix, methods, fastmatch, anticlust

Suggests:

nloptr

ByteCompile:

TRUE

Date:

2025-01-09

RoxygenNote:

7.3.2

Encoding:

UTF-8

NeedsCompilation:

yes

Packaged:

2025-01-09 20:31:05 UTC; Martin

Author:

Martin Zaefferer [aut, cre]

Maintainer:

Martin Zaefferer <mzaefferer@gmail.com>

Repository:

CRAN

Date/Publication:

2025-01-09 21:10:05 UTC

Combinatorial Efficient Global Optimization in R

Description

Combinatorial Efficient Global Optimization

Details

Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces. This includes methods for distance calculation, modeling and handling of indefinite kernels/distances.

Package:	CEGO
Type:	Package
Version:	2.4.3
Date:	2024-01-27
License:	GPL (>= 3)
LazyLoad:	yes

Acknowledgments

This work has been partially supported by the Federal Ministry of Education and Research (BMBF) under the grants CIMO (FKZ 17002X11) and MCIOP (FKZ 17N0311).

Author(s)

Martin Zaefferer mzaefferer@gmail.com

References

Zaefferer, Martin; Stork, Joerg; Friese, Martina; Fischbach, Andreas; Naujoks, Boris; Bartz-Beielstein, Thomas. (2014). Efficient global optimization for combinatorial problems. In Proceedings of the 2014 conference on Genetic and evolutionary computation (GECCO '14). ACM, New York, NY, USA, 871-878. DOI=10.1145/2576768.2598282

Zaefferer, Martin; Stork, Joerg; Bartz-Beielstein, Thomas. (2014). Distance Measures for Permutations in Combinatorial Efficient Global Optimization. In Parallel Problem Solving from Nature - PPSN XIII (p. 373-383). Springer International Publishing.

Zaefferer, Martin and Bartz-Beielstein, Thomas (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Create Flow shop Scheduling Problem (FSP) Benchmark

Description

Creates a benchmark function for the Flow shop Scheduling Problem.

Usage

benchmarkGeneratorFSP(a, n, m)

Arguments

a

matrix of processing times for each step and each machine

n

number of jobs

m

number of machines

Value

the function of type cost=f(permutation)

Examples

n=10
m=4
#ceate a matrix of processing times
A <- matrix(sample(n*m,replace=TRUE),n,m) 
#create FSP objective function 
fun <- benchmarkGeneratorFSP(A,n,m)
#evaluate
fun(1:n)
fun(n:1)

MaxCut Benchmark Creation

Description

Generates MaxCut problems, with binary decision variables. The MaxCut Problems are transformed to minimization problems by negation.

Usage

benchmarkGeneratorMaxCut(N, A)

Arguments

N

length of the bit strings

A

The adjacency matrix of the graph. Will be created at random if not provided.

Value

the function of type cost=f(bitstring). Returned fitness values will be negative, for purpose of minimization.

Examples

fun <- benchmarkGeneratorMaxCut(N=6)
fun(c(1,0,1,1,0,0))
fun(c(1,0,1,1,0,1))
fun(c(0,1,0,0,1,1))
fun <- benchmarkGeneratorMaxCut(A=matrix(c(0,1,0,1,1,0,1,0,0,1,0,1,1,0,1,0),4,4))
fun(c(1,0,1,0))
fun(c(1,0,1,1))
fun(c(0,1,0,1))

NK-Landscape Benchmark Creation

Description

Function that generates a NK-Landscapes.

Usage

benchmarkGeneratorNKL(N = 10, K = 1, PI = 1:K, g)

Arguments

N

length of the bit strings

K

number of neighbours contributing to fitness of one position

PI

vector, giving relative positions of each neighbour in the bit-string

g

set of fitness functions for each possible combination of string components. Will be randomly determined if not specified. Should have N rows, and 2^(K+1) columns.

Value

the function of type cost=f(bitstring). Returned fitness values will be negative, for purpose of minimization.

Examples

fun <- benchmarkGeneratorNKL(6,2)
fun(c(1,0,1,1,0,0))
fun(c(1,0,1,1,0,1))
fun(c(0,1,0,0,1,1))
fun <- benchmarkGeneratorNKL(6,3)
fun(c(1,0,1,1,0,0))
fun <- benchmarkGeneratorNKL(6,2,c(-1,1))
fun(c(1,0,1,1,0,0))
fun <- benchmarkGeneratorNKL(6,2,c(-1,1),g=matrix(runif(48),6))
fun(c(1,0,1,1,0,0))
fun(sample(c(0,1),6,TRUE))

Create Quadratic Assignment Problem (QAP) Benchmark

Description

Creates a benchmark function for the Quadratic Assignment Problem.

Usage

benchmarkGeneratorQAP(a, b)

Arguments

a

distance matrix

b

flow matrix

Value

the function of type cost=f(permutation)

Examples

set.seed(1)
n=5
#ceate a flow matrix
A <- matrix(0,n,n) 
for(i in 1:n){
	for(j in i:n){
		if(i!=j){
			A[i,j] <- sample(100,1)
			A[j,i] <- A[i,j]
 	}
	}
}
#create a distance matrix
locations <- matrix(runif(n*2)*10,,2)
B <- as.matrix(dist(locations))
#create QAP objective function 
fun <- benchmarkGeneratorQAP(A,B)
#evaluate
fun(1:n)
fun(n:1)

Create (Asymmetric) Travelling Salesperson Problem (TSP) Benchmark

Description

Creates a benchmark function for the (Asymmetric) Travelling Salesperson Problem. Path (Do not return to start of tour. Start and end of tour not fixed.) or Cycle (Return to start of tour). Symmetry depends on supplied distance matrix.

Usage

benchmarkGeneratorTSP(distanceMatrix, type = "Cycle")

Arguments

distanceMatrix

Matrix that collects the distances between travelled locations.

type

Can be "Cycle" (return to start, default) or "Path" (no return to start).

Value

the function of type cost=f(permutation)

Examples

set.seed(1)
#create 5 random locations to be part of a tour
n=5
cities <- matrix(runif(2*n),,2)
#calculate distances between cities
cdist <- as.matrix(dist(cities))
#create objective functions (for path or cycle)
fun1 <- benchmarkGeneratorTSP(cdist, "Path") 
fun2 <- benchmarkGeneratorTSP(cdist, "Cycle") 
#evaluate
fun1(1:n)
fun1(n:1)
fun2(n:1)
fun2(1:n)

Create single-machine total Weighted Tardiness (WT) Problem Benchmark

Description

Creates a benchmark function for the single-machine total Weighted Tardiness Problem.

Usage

benchmarkGeneratorWT(p, w, d)

Arguments

p

processing times

w

weights

d

due dates

Value

the function of type cost=f(permutation)

Examples

n=6
#processing times
p <- sample(100,n,replace=TRUE)
#weights
w <- sample(10,n,replace=TRUE)
#due dates
RDD <- c(0.2, 0.4, 0.6,0.8,1.0)
TF <- c(0.2, 0.4, 0.6,0.8,1.0)
i <- 1
j <- 1
P <- sum(p)
d <- runif(n,min=P*(1-TF[i]-RDD[j]/2),max=P*(1-TF[i]+RDD[j]/2))
#create WT objective function
fun <- benchmarkGeneratorWT(p,w,d)
fun(1:n)
fun(n:1)

Model building

Description

Model building support function for optimCEGO. Should not be called directly.

Usage

buildModel(res, distanceFunction, control)

Arguments

res

list with elements: (x) list of samples in input space and (y) matrix, column vector of observations for each sample

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. In case Kriging is chosen, it can also be a list of several distance functions. In this case, MLE is used to determine the most suited distance measure (see the last reference).

control

list with options:

model: Model to be used as a surrogate of the target function. Default is "K" (Kriging). Also available are: "LM" (linear, distance-based model), "RBFN" Radial Basis Function Network.
modelSettings: List of settings for model building, passed on as the control argument to the model training functions modelKriging, modelLinear, modelRBFN.
infill: This parameter specifies a function to be used for the infill criterion (e.g., the default is expected improvement infillExpectedImprovement). To use no specific infill criterion this has to be set to NA. Infill criteria are only used with models that may provide some error estimate with predictions.

Value

a list:

fit: model-fit
fpred: prediction function

Linear Distance-Based Model

Description

DEPRECATED version of the linear, distance-based model, please use modelLinear

Usage

combinatorialLM(x, y, distanceFunction, control = list())

Arguments

x

list of samples in input space

y

column vector of observations for each sample

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value

control

options for the model building procedure

Radial Basis Function Network

Description

DEPRECATED version of the RBFN model, please use modelRBFN

Usage

combinatorialRBFN(x, y, distanceFunction, control = list())

Arguments

x

list of samples in input space

y

column vector of observations for each sample

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value

control

options for the model building procedure

Compute Correlation Matrix

Description

Compute the correlation matrix of samples x, given the model object.

Usage

computeCorrelationMatrix(object, x)

Arguments

object

fit of the Kriging model (settings and parameters), of class modelKriging.

x

list of samples / data

Value

the correlation matrix

Augmented Distance Correction

Description

Correct new (test) distances, via correcting the augmented distance matrix. Internal use only.

Usage

correctionAugmentedDistanceVector(d, object, x)

Arguments

d

new distance vector

object

a modelKriging fit

x

new samples (belonging to distances d)

Value

vector of augmented, corrected distances

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Augmented Kernel Correction

Description

Correct new (test) kernel values, via correcting the augmented kernel matrix. Internal use only.

Usage

correctionAugmentedKernelVector(k, object, x)

Arguments

k

new kernel value vector

object

a modelKriging fit

x

new samples (belonging to kernel values k)

Value

vector of augmented, corrected kernel values

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Correcting Conditional Negative Semi-Definiteness

Description

Correcting, e.g., a distance matrix with chosen methods so that it becomes a CNSD matrix.

Usage

correctionCNSD(mat, method = "flip", tol = 1e-08)

Arguments

mat

symmetric matrix, which should be at least of size 3x3

method

string that specifies method for correction: spectrum clip "clip", spectrum flip "flip", nearest definite matrix "near", spectrum square"square", spectrum diffusion "diffusion".

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Value

the corrected CNSD matrix

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D) #matrix should not be CNSD
D <- correctionCNSD(D)
is.CNSD(D) #matrix should now be CNSD
D
# note: to fix the negative distances, use repairConditionsDistanceMatrix. 
# Or else, use correctionDistanceMatrix.

Correcting Definiteness of a Matrix

Description

Correcting a (possibly indefinite) symmetric matrix with chosen approach so that it will have desired definiteness type: positive or negative semi-definite (PSD, NSD).

Usage

correctionDefinite(mat, type = "PSD", method = "flip", tol = 1e-08)

Arguments

mat

symmetric matrix

type

string that specifies type of correction: "PSD","NSD" to enforce PSD or NSD matrices respectively.

method

string that specifies method for correction: spectrum clip "clip", spectrum flip "flip", nearest definite matrix "near", spectrum square"square", spectrum diffusion "diffusion".

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Value

list with

mat: corrected matrix
isIndefinite: boolean, whether original matrix was indefinite
lambda: the eigenvalues of the original matrix
lambdanew: the eigenvalues of the corrected matrix
U: the matrix of eigenvectors
a: the transformation vector

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.NSD(D) #matrix should not be CNSD
D <- correctionDefinite(D,type="NSD")$mat
is.NSD(D) #matrix should now be CNSD
# different example: PSD kernel
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
is.PSD(K)
K <- correctionDefinite(K,type="PSD")$mat
is.PSD(K)

Correction of a Distance Matrix

Description

Convert (possibly non-euclidean or non-metric) distance matrix with chosen approach so that it becomes a CNSD matrix. Optionally, the resulting matrix is enforced to have positive elements and zero diagonal, with the repair parameter. Essentially, this is a combination of functions correctionDefinite or correctionCNSD with repairConditionsDistanceMatrix.

Usage

correctionDistanceMatrix(
  mat,
  type = "NSD",
  method = "flip",
  repair = TRUE,
  tol = 1e-08
)

Arguments

mat

symmetric distance matrix

type

string that specifies type of correction: "CNSD","NSD" to enforce CNSD or NSD matrices respectively.

method

string that specifies method for correction: spectrum clip "clip", spectrum flip "flip", nearest definite matrix "near", spectrum square"square", spectrum diffusion "diffusion", feature embedding "feature".

repair

boolean, whether or not to use condition repair, so that elements are positive, and diagonal is zero.

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Value

list with corrected distance matrix mat, isCNSD (boolean, whether original matrix was CNSD) and transformation matrix A.

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D) #matrix should not be CNSD
D <- correctionDistanceMatrix(D)$mat
is.CNSD(D) #matrix should now be CNSD
D

Correction of a Kernel (Correlation) Matrix

Description

Convert a non-PSD kernel matrix with chosen approach so that it becomes a PSD matrix. Optionally, the resulting matrix is enforced to have values between -1 and 1 and a diagonal of 1s, with the repair parameter. That means, it is (optionally) converted to a valid correlation matrix. Essentially, this is a combination of correctionDefinite with repairConditionsCorrelationMatrix.

Usage

correctionKernelMatrix(mat, method = "flip", repair = TRUE, tol = 1e-08)

Arguments

mat

symmetric kernel matrix

method

string that specifies method for correction: spectrum clip "clip", spectrum flip "flip", nearest definite matrix "near", spectrum square"square", spectrum diffusion "diffusion".

repair

boolean, whether or not to use condition repair, so that elements between -1 and 1, and the diagonal values are 1.

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Value

list with corrected kernel matrix mat, isPSD (boolean, whether original matrix was PSD), transformation matrix A, the matrix of eigenvectors (U) and the transformation vector (a)

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
is.PSD(K) #matrix should not be PSD
K <- correctionKernelMatrix(K)$mat
is.PSD(K) #matrix should now be CNSD
K

Simulation-based Test Function Generator, Object Interface

Description

Generate test functions for assessment of optimization algorithms with non-conditional or conditional simulation, based on real-world data. For a more streamlined interface, see testFunctionGeneratorSim.

Usage

createSimulatedTestFunction(
  xsim,
  fit,
  nsim = 10,
  conditionalSimulation = TRUE,
  seed = NA
)

Arguments

xsim

list of samples in input space, for simulation

fit

an object generated by modelKriging

nsim

the number of simulations, or test functions, to be created

conditionalSimulation

whether (TRUE) or not (FALSE) to use conditional simulation

seed

a random number generator seed. Defaults to NA; which means no seed is set. For sake of reproducibility, set this to some integer value.

Value

a list of functions, where each function is the interpolation of one simulation realization. The length of the list depends on the nsim parameter.

References

N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.

C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.

Zaefferer, M.; Fischbach, A.; Naujoks, B. & Bartz-Beielstein, T. Simulation Based Test Functions for Optimization Algorithms Proceedings of the Genetic and Evolutionary Computation Conference 2017, ACM, 2017, 8.

Examples

nsim <- 10
seed <- 12345
n <- 6
set.seed(seed)
#target function:
fun <- function(x){
  exp(-20* x) + sin(6*x^2) + x
}
# "vectorize" target
f <- function(x){sapply(x,fun)}
# distance function
dF <- function(x,y)(sum((x-y)^2)) #sum of squares 
#start pdf creation
# plot params
par(mfrow=c(4,1),mar=c(2.3,2.5,0.2,0.2),mgp=c(1.4,0.5,0))
#test samples for plots
xtest <- as.list(seq(from=-0,by=0.005,to=1))
plot(xtest,f(xtest),type="l",xlab="x",ylab="Obj. function")
#evaluation samples (training)
xb <- as.list(runif(n)) 
yb <- f(xb)
# support samples for simulation
x <- as.list(sort(c(runif(100),unlist(xb))))
# fit the model	
fit <- modelKriging(xb,yb,dF,control=list(
   algThetaControl=list(method="NLOPT_GN_DIRECT_L",funEvals=100),useLambda=FALSE))
fit
#predicted obj. function values
ypred <- predict(fit,as.list(xtest))$y
plot(unlist(xtest),ypred,type="l",xlab="x",ylab="Estimation")
points(unlist(xb),yb,pch=19)
##############################	
# create test function non conditional
##############################
fun <- createSimulatedTestFunction(x,fit,nsim,FALSE,seed=1)
ynew <- NULL
for(i in 1:nsim)
  ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Simulation")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
##############################	
# create test function conditional
##############################
fun <- createSimulatedTestFunction(x,fit,nsim,TRUE,seed=1)
ynew <- NULL
for(i in 1:nsim)
  ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Conditional sim.")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
points(unlist(xb),yb,pch=19)
dev.off()

Max-Min-Distance Design

Description

Build a design of experiments in a sequential manner: First candidate solution is created at random. Afterwards, candidates are added sequentially, maximizing the minimum distances to the existing candidates. Each max-min problem is resolved by random sampling. The aim is to get a rather diverse design.

Usage

designMaxMinDist(x = NULL, cf, size, control = list())

Arguments

x

Optional list of user specified solutions to be added to the design/population, defaults to NULL

cf

Creation function, creates random new individuals

size

size of the design

control

list of controls. control$distanceFunction requires a distance function to compare two candidates created by cf. control$budget is the number of candidates for the random sampling, defaults to 100.

Value

Returns list with experimental design without duplicates

Examples

# Create a design of 10 permutations, each with n=5 elements, 
# and with 50 candidates for each sample.
# Note, that in this specific case the number of candidates 
# should be no larger than factorial(n).
# The default (hamming distance) is used.
design <- designMaxMinDist(NULL,function()sample(5),10,
		control=list(budget=50))
# Create a design of 20 real valued 2d vectors, 
# with 100 candidates for each sample
# using euclidean distance.
design <- designMaxMinDist(NULL,function()runif(2),20,
	control=list(budget=100,
	distanceFunction=function(x,y)sqrt(sum((x-y)^2))))
# plot the resulting design
plot(matrix(unlist(design),,2,byrow=TRUE))

Random Design

Description

Create a random initial population or experimental design, given a specifed creation function, as well as a optional set of user-specified design members and a maximum design size. Also removes duplicates from the design/population.

Usage

designRandom(x = NULL, cf, size, control = list())

Arguments

x

Optional list of user specified solutions to be added to the design, defaults to NULL

cf

Creation function, creates random new individuals

size

size of the design

control

not used

Value

Returns list with experimental design without duplicates

Examples

# Create a design of 10 permutations, each with 5 elements
design <- designRandom(NULL,function()sample(5),10)
# Create a design of 20 real valued 2d vectors
design <- designRandom(NULL,function()runif(2),20)

Calculate Distance Matrix

Description

Calculate the distance between all samples in a list, and return as matrix.

Usage

distanceMatrix(X, distFun, ...)

Arguments

X

list of samples, where each list element is a suitable input for distFun

distFun

Distance function of type f(x,y)=r, where r is a scalar and x and y are elements whose distance is evaluated.

...

further arguments passed to distFun

Value

The distance matrix

Examples

x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2), sample(5))
distanceMatrix(x,distancePermutationHamming)

Update distance matrix

Description

Update an existing distance matrix D_mat by adding distances of all previous candidate solutions to one new candidate solution, d_vec= d(x_i,x_new).

Usage

distanceMatrixUpdate(distanceMat, x, distanceFunction, ...)

Arguments

distanceMat

original distance matrix D_mat

x

list of candidate solutions, last in list is the new solution

distanceFunction

Distance function of type f(x,y)=r, where r is a scalar and x and y are candidate solutions whose distance is evaluated.

...

further arguments passed to distanceFunction

Value

matrix of distances between all solutions x

Examples

x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2))
dm <- distanceMatrix(x,distancePermutationHamming)
x <- append(x,list(1:5))
dmUp <- distanceMatrixUpdate(dm,x,distancePermutationHamming)

Distance Matrix Wrapper

Description

Wrapper to calculate the distance matrix, with one or multiple distance functions.

Usage

distanceMatrixWrapper(x, distanceFunction, ...)

Arguments

x

list of candidate solutions whose distance is evaluated

distanceFunction

Distance function of type f(x,y)=r, where r is a scalar and x and y are candidate solutions whose distance is evaluated.

...

further arguments passed to distanceFunction

Value

matrix of distances between all solutions in list x

Examples

x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2))
dm1 <- distanceMatrix(x,distancePermutationHamming)
dm2 <- distanceMatrix(x,distancePermutationInsert)
dmBoth <- distanceMatrixWrapper(x,list(distancePermutationHamming,distancePermutationInsert))

Hamming Distance for Vectors

Description

The number of unequal elements of two vectors (which may be of unequal length), divided by the number of elements (of the larger vector).

Usage

distanceNumericHamming(x, y)

Arguments

x

first vector

y

second vector

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1

Examples

#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericHamming(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericHamming)

Longest Common Substring for Numeric Vectors

Description

Longest common substring distance for two numeric vectors, e.g., bit vectors.

Usage

distanceNumericLCStr(x, y)

Arguments

x

first vector (numeric)

y

second vector (numeric)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1

Examples

#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericLCStr(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericLCStr)

Levenshtein Distance for Numeric Vectors

Description

Levenshtein distance for two numeric vectors, e.g., bit vectors.

Usage

distanceNumericLevenshtein(x, y)

Arguments

x

first vector (numeric)

y

second vector (numeric)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1

Examples

#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericLevenshtein(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericLevenshtein)

Adjacency Distance for Permutations

Description

Bi-directional adjacency distance for permutations, depending on how often two elements are neighbours in both permutations x and y.

Usage

distancePermutationAdjacency(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Sevaux, Marc, and Kenneth Soerensen. "Permutation distance measures for memetic algorithms with population management." Proceedings of 6th Metaheuristics International Conference (MIC'05). 2005.

Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.

Examples

x <- 1:5
y <- 5:1
distancePermutationAdjacency(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationAdjacency)

Chebyshev Distance for Permutations

Description

Chebyshev distance for permutations. Specific to permutations is only the scaling to values of 0 to 1:

d(x,y) = \frac{max(|x - y|) }{ (n-1) }

where n is the length of the permutations x and y.

Usage

distancePermutationChebyshev(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationChebyshev(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationChebyshev)

Cosine Distance for Permutations

Description

The Cosine distance for permutations is derived from the Cosine similarity measure which has been applied in fields like text mining. It is based on the scalar product of two vectors (here: permutations).

Usage

distancePermutationCos(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Singhal, Amit (2001)."Modern Information Retrieval: A Brief Overview". Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 24 (4): 35-43

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationCos(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationCos)

Euclidean Distance for Permutations

Description

Euclidean distance for permutations, scaled to values between 0 and 1:

d(x,y) = \frac{1}{r} \sqrt(\sum_{i=1}^n (x_i - y_i)^2)

where n is the length of the permutations x and y, and scaling factor r=sqrt(2*4*c*(c+1)*(2*c+1)/6) with c=(n-1)/2 (if n is odd) or r=sqrt(2*c*(2*c-1)*(2*c+1)/3) with c=n/2 (if n is even).

Usage

distancePermutationEuclidean(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationEuclidean(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationEuclidean)

Hamming Distance for Permutations

Description

Hamming distance for permutations, scaled to values between 0 and 1. That is, the number of unequal elements of two permutations, divided by the permutations length.

Usage

distancePermutationHamming(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationHamming(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationHamming)

Insert Distance for Permutations

Description

The Insert Distance is an edit distance. It counts the minimum number of delete/insert operations required to transform one permutation into another. A delete/insert operation shifts one element to a new position. All other elements move accordingly to make place for the element. E.g., the following shows a single delete/insert move that sorts the corresponding permutation: 1 4 2 3 5 -> 1 2 3 4 5.

Usage

distancePermutationInsert(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.

Wikipedia contributors, "Longest increasing subsequence", Wikipedia, The Free Encyclopedia, 12 November 2014, 19:38 UTC, [accessed 13 November 2014]

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationInsert(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationInsert)

Interchange Distance for Permutations

Description

The interchange distance is an edit-distance, counting how many edit operation (here: interchanges, i.e., transposition of two arbitrary elements) have to be performed to transform permutation x into permutation y.

Usage

distancePermutationInterchange(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.

Examples

x <- 1:5
y <- c(1,4,3,2,5)
distancePermutationInterchange(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationInterchange)

Longest Common Substring Distance for Permutations

Description

Distance of permutations. Based on the longest string of adjacent elements that two permutations have in common.

Usage

distancePermutationLCStr(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

#' @return numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Hirschberg, Daniel S. "A linear space algorithm for computing maximal common subsequences." Communications of the ACM 18.6 (1975): 341-343.

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationLCStr(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLCStr)

Lee Distance for Permutations

Description

Usually a string distance, with slightly different definition. Adapted to permutations as:

d(x,y) = \sum_{i=1}^n min(|x_i - y_i|), n- |x_i - y_i|)

where n is the length of the permutations x and y.

Usage

distancePermutationLee(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Lee, C., "Some properties of nonbinary error-correcting codes," Information Theory, IRE Transactions on, vol.4, no.2, pp.77,82, June 1958

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationLee(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLee)

Levenshtein Distance for Permutations

Description

Levenshtein Distance, often just called "Edit Distance". The number of insertions, substitutions or deletions to turn one permutation (or string of equal length) into another.

Usage

distancePermutationLevenshtein(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Levenshtein, Vladimir I. "Binary codes capable of correcting deletions, insertions and reversals." Soviet physics doklady. Vol. 10. 1966.

Examples

x <- 1:5
y <- c(1,2,5,4,3)
distancePermutationLevenshtein(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLevenshtein)

Lexicographic permutation distance

Description

This function calculates the lexicographic permutation distance. That is the difference of positions that both positions would receive in a lexicographic ordering. Note, that this distance measure can quickly become inaccurate if the length of the permutations grows too large, due to being based on the factorial of the length. In general, permutations longer than 100 elements should be avoided.

Usage

distancePermutationLex(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationLex(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLex)

Manhattan Distance for Permutations

Description

Manhattan distance for permutations, scaled to values between 0 and 1:

d(x,y) = \frac{1}{r} \sum_{i=1}^n |x_i - y_i|

where n is the length of the permutations x and y, and scaling factor r=(n^2-1)/2 (if n is odd) or r=((n^2)/2 (if n is even).

Usage

distancePermutationManhattan(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationManhattan(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationManhattan)

Position Distance for Permutations

Description

Position distance (or Spearmans Correlation Coefficient), scaled to values between 0 and 1.

Usage

distancePermutationPosition(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.

Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.

Examples

x <- 1:5
y <- c(1,3,5,4,2)
distancePermutationPosition(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationPosition)

Squared Position Distance for Permutations

Description

Squared position distance (or Spearmans Footrule), scaled to values between 0 and 1.

Usage

distancePermutationPosition2(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.

Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.

Examples

x <- 1:5
y <- c(1,3,5,4,2)
distancePermutationPosition2(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationPosition2)

R-Distance for Permutations

Description

R distance or unidirectional adjacency distance. Based on count of number of times that a two element sequence in x also occurs in y, in the same order.

Usage

distancePermutationR(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Sevaux, Marc, and Kenneth Soerensen. "Permutation distance measures for memetic algorithms with population management." Proceedings of 6th Metaheuristics International Conference (MIC'05). 2005.

Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.

Examples

x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationR(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationR)

Swap-Distance for Permutations

Description

The swap distance is an edit-distance, counting how many edit operation (here: swaps, i.e., transposition of two adjacent elements) have to be performed to transform permutation x into permutation y. Note: In v2.4.0 of CEGO and earlier, this function actually computed the swap distance on the inverted permutations (i.e., on the rankings, rather than orderin). This is now (v2.4.2 and later) corrected by inverting the permutations x and y before computing the distance (ie. computing ordering first). The original behavior can be reproduced by distancePermutationSwapInv. This issue was kindly reported by Manuel Lopez-Ibanez and the difference in terms of behavior is discussed by Ekhine Irurozki and him (2021).

Usage

distancePermutationSwap(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

References

Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.

Irurozki, Ekhine and Ibanez-Lopez Unbalanced Mallows Models for Optimizing Expensive Black-Box Permutation Problems. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2021. ACM Press, New York, NY, 2021. doi: 10.1145/3449639.3459366

Examples

x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationSwap(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationSwap)

Inverse-Swap-Distance for Permutations

Description

The swap distance on the inverse of permutations x and y. See distancePermutationSwap for non-inversed version.

Usage

distancePermutationSwapInv(x, y)

Arguments

x

first permutation (integer vector)

y

second permutation (integer vector)

Value

numeric distance value

d(x,y)

, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)

Examples

x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationSwapInv(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationSwapInv)

Euclidean Distance

Description

The Euclidean distance for real vectors.

Usage

distanceRealEuclidean(x, y)

Arguments

x

first real vector

y

second real vector

Value

numeric distance value

d(x,y)

Examples

x <- runif(5)
y <- runif(5)
distanceRealEuclidean(x,y)

Levenshtein Distance forsSequences of numbers

Description

Levenshtein distance for two sequences of numbers

Usage

distanceSequenceLevenshtein(x, y)

Arguments

x

first vector (numeric vector)

y

second vector (numeric vector)

Value

numeric distance value

d(x,y)

Examples

#e.g., used for distance between integer sequence
x <- c(0,1,10,2,4)
y <- c(10,1,0,4,-4)
distanceSequenceLevenshtein(x,y)
p <- replicate(10,sample(1:5,3,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceSequenceLevenshtein)

Hamming Distance for Strings

Description

Number of unequal letters in two strings.

Usage

distanceStringHamming(x, y)

Arguments

x

first string (class: character)

y

second string (class: character)

Value

numeric distance value

d(x,y)

Examples

distanceStringHamming("ABCD","AACC")

Longest Common Substring distance

Description

Distance between strings, based on the longest common substring.

Usage

distanceStringLCStr(x, y)

Arguments

x

first string (class: character)

y

second string (class: character)

Value

numeric distance value

d(x,y)

Examples

distanceStringLCStr("ABCD","AACC")

Levenshtein Distance for Strings

Description

Number of insertions, deletions and substitutions to transform one string into another

Usage

distanceStringLevenshtein(x, y)

Arguments

x

first string (class: character)

y

second string (class: character)

Value

numeric distance value

d(x,y)

Examples

distanceStringLevenshtein("ABCD","AACC")

Calculate Distance Vector

Description

Calculate the distance between a single sample and all samples in a list.

Usage

distanceVector(a, X, distFun, ...)

Arguments

a

A single sample which is a suitable input for distFun

X

list of samples, where each list element is a suitable input for distFun

distFun

Distance function of type f(x,y)=r, where r is a scalar and x and y are elements whose distance is evaluated.

...

further arguments passed to distFun

Value

A numerical vector of distances

Examples

x <- 1:5
y <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2))
distanceVector(x,y,distancePermutationHamming)

Cubic Kernel for Kriging

Description

Cubic Kernel for Kriging

Usage

fcorrCubic(D, theta = 0)

Arguments

D

distance matrix

theta

kernel parameter

Value

matrix (Psi)

Gaussian Kernel for Kriging

Description

Gaussian Kernel for Kriging

Usage

fcorrGauss(D, theta = 0)

Arguments

D

distance matrix

theta

kernel parameter

Value

matrix (Psi)

Linear Kernel for Kriging

Description

Linear Kernel for Kriging

Usage

fcorrLinear(D, theta = 0)

Arguments

D

distance matrix

theta

kernel parameter

Value

matrix (Psi)

Spherical Kernel for Kriging

Description

Spherical Kernel for Kriging

Usage

fcorrSphere(D, theta = 0)

Arguments

D

distance matrix

theta

kernel parameter

Value

matrix (Psi)

Negative Logarithm of Expected Improvement

Description

This function calculates the Expected Improvement" of candidate solutions, based on predicted means, standard deviations (uncertainty) and the best known objective function value so far.

Usage

infillExpectedImprovement(mean, sd, min)

Arguments

mean

predicted mean values

sd

predicted standard deviation

min

minimum of all observations so far

Value

Returns the negative logarithm of the Expected Improvement.

Check for Conditional Negative Semi-Definiteness

Description

This function checks whether a symmetric matrix is Conditionally Negative Semi-Definite (CNSD). Note that this function does not check whether the matrix is actually symmetric.

Usage

is.CNSD(X, method = "alg1", tol = 1e-08)

Arguments

X

a symmetric matrix

method

a string, specifiying the method to be used. "alg1" is based on algorithm 1 in Ikramov and Savel'eva (2000). "alg2" is based on theorem 3.2 in Ikramov and Savel'eva (2000). "eucl" is based on Glunt (1990).

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Symmetric, CNSD matrices are, e.g., euclidean distance matrices, whiche are required to produce Positive Semi-Definite correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector Machines.

Value

boolean, which is TRUE if X is CNSD

References

Ikramov, K. and Savel'eva, N. Conditionally definite matrices, Journal of Mathematical Sciences, Kluwer Academic Publishers-Plenum Publishers, 2000, 98, 1-50

Glunt, W.; Hayden, T. L.; Hong, S. and Wells, J. An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM Journal on Matrix Analysis and Applications, SIAM, 1990, 11, 589-600

Examples

# The following permutations will produce
# a non-CNSD distance matrix with Insert distance
# and a CNSD distance matrix with Hamming distance
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D,"alg1")
is.CNSD(D,"alg2")
is.CNSD(D,"eucl")
D <- distanceMatrix(x,distancePermutationHamming)
is.CNSD(D,"alg1")
is.CNSD(D,"alg2")
is.CNSD(D,"eucl")

Check for Negative Semi-Definiteness

Description

This function checks whether a symmetric matrix is Negative Semi-Definite (NSD). That means, it is determined whether all eigenvalues of the matrix are non-positive. Note that this function does not check whether the matrix is actually symmetric.

Usage

is.NSD(X, tol = 1e-08)

Arguments

X

a symmetric matrix

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Symmetric, NSD matrices are, e.g., correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector regression.

Value

boolean, which is TRUE if X is NSD

Examples

# The following permutations will produce
# a non-PSD kernel matrix with Insert distance
# and a PSD distance matrix with Hamming distance
# (for the given theta value of 0.01)-
# The respective negative should be (non-) NSD
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
K <- exp(-0.01*distanceMatrix(x,distancePermutationInsert))
is.NSD(-K) 
K <- exp(-0.01*distanceMatrix(x,distancePermutationHamming))
is.NSD(-K)

Check for Positive Semi-Definiteness

Description

This function checks whether a symmetric matrix is Positive Semi-Definite (PSD). That means, it is determined whether all eigenvalues of the matrix are non-negative. Note that this function does not check whether the matrix is actually symmetric.

Usage

is.PSD(X, tol = 1e-08)

Arguments

X

a symmetric matrix

tol

torelance value. Eigenvalues between -tol and tol are assumed to be zero.

Symmetric, PSD matrices are, e.g., correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector regression.

Value

boolean, which is TRUE if X is PSD

Examples

# The following permutations will produce
# a non-PSD kernel matrix with Insert distance
# and a PSD distance matrix with Hamming distance
# (for the given theta value of 0.01)
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
K <- exp(-0.01*distanceMatrix(x,distancePermutationInsert))
is.PSD(K)
K <- exp(-0.01*distanceMatrix(x,distancePermutationHamming))
is.PSD(K)

Calculate Kernel Matrix

Description

Calculate the similarities between all samples in a list, and return as matrix.

Usage

kernelMatrix(X, kernFun, ...)

Arguments

X

list of samples, where each list element is a suitable input for kernFun

kernFun

Kernel function of type f(x,y)=r, where r is a scalar and x and y are elements whose similarity is evaluated.

...

further arguments passed to distFun

Value

The similarity / kernel matrix

Examples

x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2), sample(5))
kernFun <- function(x,y){
		exp(-distancePermutationHamming(x,y))
}
kernelMatrix(x,distancePermutationHamming)

Create Gaussian Landscape

Description

This function is loosely based on the Gaussian Landscape Generator by Bo Yuan and Marcus Gallagher. It creates a Gaussian Landscape every time it is called. This Landscape can be evaluated like a function. To adapt to combinatorial spaces, the Gaussians are here based on a user-specified distance measure. Due to the expected nature of combinatorial spaces and their lack of direction, the resulting Gaussians are much simplified in comparison to the continuous, vector-valued case (e.g., no rotation). Since the CEGO package is tailored to minimization, the landscape is inverted.

Usage

landscapeGeneratorGaussian(
  nGaussian = 10,
  theta = 1,
  ratio = 0.2,
  seed = 1,
  distanceFunction,
  creationFunction
)

Arguments

nGaussian

number of Gaussian components in the landscape. Default is 10.

theta

controls width of Gaussian components as a multiplier. Default is 1.

ratio

minimal function value of the local minima. Default is 0.2. (Note: Global minimum will be at zero, local minima will be in range [ratio;1])

seed

seed for the random number generator used before creation of the landscape. Generator status will be saved and reset afterwards.

distanceFunction

A function of type f(x,y), to evaluate distance between to samples in their given representation.

creationFunction

function to randomly generate the centers of the Gaussians, in form of their given representation.

Value

returns a function.The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function.

References

B. Yuan and M. Gallagher (2003) "On Building a Principled Framework for Evaluating and Testing Evolutionary Algorithms: A Continuous Landscape Generator". In Proceedings of the 2003 Congress on Evolutionary Computation, IEEE, pp. 451-458, Canberra, Australia.

Examples

#rng seed
seed=101
# distance function
dF <- function(x,y)(sum((x-y)^2)) #sum of squares 
#dF <- function(x,y)sqrt(sum((x-y)^2)) #euclidean distance
# creation function
cF <- function()runif(1)
# plot pars
par(mfrow=c(3,1),mar=c(3.5,3.5,0.2,0.2),mgp=c(2,1,0))
## uni modal distance landscape
# set seed
set.seed(seed)
#landscape
lF <- landscapeGeneratorUNI(cF(),dF)
x <- as.list(seq(from=0,by=0.001,to=1))
plot(x,lF(x),type="l")
## multi-modal distance landscape
# set seed
set.seed(seed)
#landscape
lF <- landscapeGeneratorMUL(replicate(5,cF(),FALSE),dF)
plot(x,lF(x),type="l")
## glg landscape
#landscape
lF <- landscapeGeneratorGaussian(nGaussian=20,theta=1,
ratio=0.3,seed=seed,dF,cF)
plot(x,lF(x),type="l")

Gaussian Landscape Core function

Description

Core Gaussian landscape function. Should not be called directly, as it does not contain proper seed handling.

Usage

landscapeGeneratorGaussianBuild(nGaussian = 10, ratio = 0.2, creationFunction)

Arguments

nGaussian

number of Gaussian components in the landscape. Default is 10.

ratio

minimal function value of the local minima. Default is 0.2. (Note: Global minimum will be at zero, local minimal will be in range [ratio;1])

creationFunction

function to randomly generate the centers of the gaussians

Value

returns a list, with the following items:

centers: samples which are the centers of each Gaussian
covinv: inverse of variance of each Gaussian
opt: value at randomly chosen optimum center
nGauss: number of Gaussian components

Gaussian Landscape Evaluation

Description

Evaluate a Gaussian landscape. Should not be called directly.

Usage

landscapeGeneratorGaussianEval(x, glg)

Arguments

x

list of samples to evaluate

glg

list of values defining the Gaussian Landscape, created by landscapeGeneratorGaussianBuild.

Value

returns a list, with the following items:
value value of the combined landscape components value of each component

Multimodal Fitness Landscape

Description

This function generates multi-modal fitness landscapes based on distance measures. The fitness is the minimal distance to several reference individuals or centers. Hence, each reference individual is an optimum of the landscape.

Usage

landscapeGeneratorMUL(ref, distanceFunction)

Arguments

ref

list of reference individuals / centers

distanceFunction

Distance function, used to evaluate d(x,ref[[n]]), where x is an arbitrary new individual

Value

returns a function. The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function. The function returns a numeric vector.

Examples

fun <- landscapeGeneratorMUL(ref=list(1:7,c(2,4,1,5,3,7,6)),distancePermutationCos)
x <- 1:7
fun(list(x))
x <- c(2,4,1,5,3,7,6)
fun(list(x))
x <- 7:1
fun(list(x))
x <- sample(7)
fun(list(x))
## multiple solutions at once:
x <- append(list(1:7,c(2,4,1,5,3,7,6)),replicate(5,sample(7),FALSE))
fun(x)

Unimodal Fitness Landscape

Description

This function generates uni-modal fitness landscapes based on distance measures. The fitness is the distance to a reference individual or center. Hence, the reference individual is the optimum of the landscape. This function is essentially a wrapper for the landscapeGeneratorMUL

Usage

landscapeGeneratorUNI(ref, distanceFunction)

Arguments

ref

reference individual

distanceFunction

Distance function, used to evaluate d(x,ref), where x is an arbitrary new individual

Value

returns a function. The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function. The function returns a numeric vector.

References

Moraglio, Alberto, Yong-Hyuk Kim, and Yourim Yoon. "Geometric surrogate-based optimisation for permutation-based problems." Proceedings of the 13th annual conference companion on Genetic and evolutionary computation. ACM, 2011.

Examples

fun <- landscapeGeneratorUNI(ref=1:7,distancePermutationCos)
## for single solutions, note that the function still requires list input:
x <- 1:7
fun(list(x))
x <- 7:1
fun(list(x))
x <- sample(7)
fun(list(x))
## multiple solutions at once:
x <- replicate(5,sample(7),FALSE)
fun(x)

Lexicographic order number

Description

This function returns the position-number that a permutation would receive in a lexicographic ordering. It is used in the lexicographic distance measure.

Usage

lexicographicPermutationOrderNumber(x)

Arguments

x

permutation (integer vector)

Value

numeric value giving position in lexicographic order.

Examples

lexicographicPermutationOrderNumber(1:5)
lexicographicPermutationOrderNumber(c(1,2,3,5,4))
lexicographicPermutationOrderNumber(c(1,2,4,3,5))
lexicographicPermutationOrderNumber(c(1,2,4,5,3))
lexicographicPermutationOrderNumber(c(1,2,5,3,4))
lexicographicPermutationOrderNumber(c(1,2,5,4,3))
lexicographicPermutationOrderNumber(c(1,3,2,4,5))
lexicographicPermutationOrderNumber(5:1)
lexicographicPermutationOrderNumber(1:7)
lexicographicPermutationOrderNumber(7:1)

Kriging Model

Description

Implementation of a distance-based Kriging model, e.g., for mixed or combinatorial input spaces. It is based on employing suitable distance measures for the samples in input space.

Usage

modelKriging(x, y, distanceFunction, control = list())

Arguments

x

list of samples in input space

y

column vector of observations for each sample

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. It can also be a list of several distance functions. In this case, Maximum Likelihood Estimation (MLE) is used to determine the most suited distance measure. The distance function may have additional parameters. For that case, see distanceParametersLower/Upper in the controls. If distanceFunction is missing, it can also be provided in the control list.

control

(list), with the options for the model building procedure:

lower: lower boundary for theta, default is 1e-6
upper: upper boundary for theta, default is 100
corr: function to be used for correlation modelling, default is fcorrGauss
algTheta: algorithm used to find theta (as well as p and lambda), default is optimInterface.
algThetaControl: list of controls passed to algTheta.
useLambda: whether or not to use the regularization constant lambda (nugget effect). Default is FALSE.
lambdaLower: lower boundary for lambda (log scale), default is -6
lambdaUpper: upper boundary for lambda (log scale), default is 0
distanceParametersLower: lower boundary for parameters of the distance function, default is NA which means there are no distance function parameters. If several distance functions are supplied, this should be a list of lower boundary vectors for each function.
distanceParametersUpper: upper boundary for parameters of the distance function, default is NA which means there are no distance function parameters. If several distance functions are supplied, this should be a list of upper boundary vectors for each function.
distances: a distance matrix. If available, this matrix is used for model building, instead of calculating the distance matrix using the parameters distanceFunction. Default is NULL.
scaling: If TRUE: Distances values are divided by maximum distance to avoid scale bias.
reinterpolate: If TRUE: reinterpolation is used to generate better uncertainty estimates in the presence of noise.
combineDistances: By default, several distance functions or matrices are subject to a likelihood based decision, choosing one. If this parameter is TRUE, they are instead combined by determining a weighted sum. The weighting parameters are determined by MLE.
userParameters: By default: (NULL). Else, this vector is used instead of MLE to specify the model parameters, in the following order: kernel parameters, distance weights, lambda, distance parameters.
indefiniteMethod: The specific method used for correction: spectrum "clip", spectrum "flip", spectrum "square", spectrum "diffusion", feature embedding "feature", nearest definite matrix "near". Default is no correction: "none". See Zaefferer and Bartz-Beielstein (2016).
indefiniteType: The general type of correction for indefiniteness: "NSD","CNSD" or the default "PSD". See Zaefferer and Bartz-Beielstein (2016). Note, that feature embedding may not work in case of multiple distance functions.
indefiniteRepair: boolean, whether conditions of the distance matrix (in case of "NSD","CNSD" correction type) or correlation matrix (in case of "PSD" correction type) are repaired.

Details

The basic Kriging implementation is based on the work of Forrester et al. (2008). For adaptation of Kriging to mixed or combinatorial spaces, as well as choosing distance measures with Maximum Likelihood Estimation, see the other two references (Zaefferer et al., 2014).

Value

an object of class modelKriging containing the options (see control parameter) and determined parameters for the model:

theta: parameters of the kernel / correlation function determined with MLE.
lambda: regularization constant (nugget) lambda
yMu: vector of observations y, minus MLE of mu
SSQ: Maximum Likelihood Estimate (MLE) of model parameter sigma^2
mu: MLE of model parameter mu
Psi: correlation matrix Psi
Psinv: inverse of Psi
nevals: number of Likelihood evaluations during MLE of theta/lambda/p
distanceFunctionIndexMLE: If a list of several distance measures (distanceFunction) was given, this parameter contains the index value of the measure chosen with MLE.

References

Forrester, Alexander I.J.; Sobester, Andras; Keane, Andy J. (2008). Engineering Design via Surrogate Modelling - A Practical Guide. John Wiley & Sons.

Zaefferer, Martin and Bartz-Beielstein, Thomas (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

# Set random number generator seed
set.seed(1)
# Simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
# Generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
# Determin true objective function values
y <- fn(x)
ytest <- fn(xtest)
# Build model
fit <- modelKriging(x,y,distancePermutationHamming,
    control=list(algThetaControl=list(method="L-BFGS-B"),useLambda=FALSE))
# Predicted obj. function values
ypred <- predict(fit,xtest)$y
# Uncertainty estimate
fit$predAll <- TRUE
spred <- predict(fit,xtest)$s
# Plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
    pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
segments(ytest, ypred-spred,ytest, ypred+spred)
epsilon = 0.02
segments(ytest-epsilon,ypred-spred,ytest+epsilon,ypred-spred)
segments(ytest-epsilon,ypred+spred,ytest+epsilon,ypred+spred)
abline(0,1,lty=2)
# Use a different/custom optimizer (here: SANN) for maximum likelihood estimation: 
# (Note: Bound constraints are recommended, to avoid Inf values.
# This is really just a demonstration. SANN does not respect bound constraints.)
optimizer1 <- function(x,fun,lower=NULL,upper=NULL,control=NULL,...){
  res <- optim(x,fun,method="SANN",control=list(maxit=100),...)
  list(xbest=res$par,ybest=res$value,count=res$counts)
}
fit <- modelKriging(x,y,distancePermutationHamming,
                   control=list(algTheta=optimizer1,useLambda=FALSE))
#One-dimensional optimizer (Brent). Note, that Brent will not work when 
#several parameters have to be set, e.g., when using nugget effect (lambda).
#However, Brent may be quite efficient otherwise.
optimizer2 <- function(x,fun,lower,upper,control=NULL,...){
 res <- optim(x,fun,method="Brent",lower=lower,upper=upper,...)
 list(xbest=res$par,ybest=res$value,count=res$counts)
}
fit <- modelKriging(x,y,distancePermutationHamming,
                    control=list(algTheta=optimizer2,useLambda=FALSE))

Build clustered model

Description

This function builds an ensemble of Gaussian Process model, where each individual model is fitted to a partition of the parameter space. Partitions are generated by.

Usage

modelKrigingClust(x, y, distanceFunction, control = list())

Arguments

x

y

distanceFunction

distanceFunction

control

control

Value

an object

Kriging: Distance Matrix Calculation

Description

Calculate and scale the distance matrix used in a Kriging model. Include definiteness correction. Not to be called directly.

Usage

modelKrigingDistanceCalculation(
  x,
  distanceFunction,
  parameters = NA,
  distances,
  scaling,
  combineDistances,
  indefiniteMethod,
  indefiniteType,
  indefiniteRepair,
  lower
)

Arguments

x

list of samples in input space

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. It can also be a list of several distance functions. In this case, Maximum Likelihood Estimation (MLE) is used to determine the most suited distance measure. The distance function may have additional parameters.

parameters

parameters passed to the distance function as a vector.

distances

precomputed distances, set to NA if not available.

scaling

boolean, whether to scale the distance matrix.

combineDistances

boolean, whether to combine the distances of different functions.

indefiniteMethod

method for handling non-conditionally-definite matrices.

indefiniteType

type of handling for non-conditionally-definite matrices.

indefiniteRepair

whether to further repair other conditions (beside definiteness).

lower

lower boundary for distance function parameters.

Value

a list with elements D (distance matrix), maxD (maximal distance for scaling purpose).

Kriging: Initial guess and bounds

Description

Initialize parameter tuning for the Kriging model, setting the initial guess as well as bound constraints.

Usage

modelKrigingInit(
  startTheta = NULL,
  lowerTheta = NULL,
  upperTheta = NULL,
  useLambda,
  lambdaLower,
  lambdaUpper,
  combineDistances,
  nd,
  distanceParameters = F,
  distanceParametersLower = NA,
  distanceParametersUpper = NA
)

Arguments

startTheta

user provided start guess (optional).

lowerTheta

lower boundary for theta values (log scale), the kernel parameters.

upperTheta

upper boundary for theta values (log scale), the kernel parameters.

useLambda

boolean, whether nugget effect (lambda) is used.

lambdaLower

lower boundary for lambda (log scale).

lambdaUpper

upper boundary for lambda (log scale).

combineDistances

boolean, whether multiple distances are combined.

nd

number of distance function.

distanceParameters

whether the distance function parameters should be optimized

distanceParametersLower

lower boundary for parameters of the distance function, default is NA which means there are no distance function parameters. If several distance functions are supplied, this should be a list of lower boundary vectors for each function.

distanceParametersUpper

upper boundary for parameters of the distance function, default is NA which means there are no distance function parameters. If several distance functions are supplied, this should be a list of upper boundary vectors for each function.

Value

a list with elements x0 (start guess), lower (lower bound), upper (upper bound).

Kriging Prediction (internal)

Description

Predict with a model fit resulting from modelKriging.

Usage

modelKrigingInternalPredictor(object, x)

Arguments

object

fit of the Kriging model (settings and parameters), of class modelKriging.

x

list of samples to be predicted

Value

returns a list with:

y: predicted values
psi: correlations between x and training data

Calculate negative log-likelihood

Description

Used to determine theta/lambda/p values for the Kriging model in modelKriging with Maximum Likelihood Estimation (MLE).

Usage

modelKrigingLikelihood(
  xt,
  D,
  y,
  useLambda = FALSE,
  corr = fcorrGauss,
  indefiniteMethod = "none",
  indefiniteType = "PSD",
  indefiniteRepair = FALSE,
  returnLikelihoodOnly = TRUE,
  inverter = "chol",
  ntheta = 1
)

Arguments

xt

vector, containing parameters like theta, p and lambda.

D

matrix (or list of multiple matrices) of distances between training samples. In case of multiple distance matrices, theta (part of xt) has to be a vector, giving a weighting parameter for each matrix.

y

vector of observations at sample locations.

useLambda

whether to use nugget effect, i.e., lambda (FALSE at default).

corr

whether to use nugget effect, i.e., lambda (fcorrGauss at default).

indefiniteMethod

The specific method used for correction: spectrum "clip", spectrum "flip", spectrum "square", spectrum "diffusion", feature embedding "feature", nearest definite matrix "near". Default is no correction: "none". See Zaefferer and Bartz-Beielstein (2016).

indefiniteType

The general type of correction for indefiniteness: "NSD","CNSD" or the default "PSD". See Zaefferer and Bartz-Beielstein (2016).

indefiniteRepair

boolean, whether conditions of the distance matrix (in case of "NSD","CNSD" correction type) or correlation matrix (in case of "PSD" correction type) are repaired.

returnLikelihoodOnly

boolean, whether the function should return only the likelihood, or a else a list (see return information below).

inverter

string, defining the inverter to use. default "chol" is inversion via chol2inv. A different string will lead to use of solve.

ntheta

number of kernel parameters.

Value

the numeric Likelihood value (if returnLikelihoodOnly is TRUE) or a list with elements:

NegLnLike: concentrated log-likelihood *-1 for minimising
Psi: correlation matrix
Psinv: inverse of correlation matrix (to save computation time in forrRegPredictor)
mu: MLE of model parameter mu
yMu: vector of observations y minus mu
SSQ: MLE of model parameter sigma^2
a: transformation vector for eigenspectrum transformation, see Zaefferer and Bartz-Beielstein (2016)
U: Matrix of eigenvectors for eigenspectrum transformation, see Zaefferer and Bartz-Beielstein (2016)
isIndefinite: whether the uncorrected correlation (kernel) matrix is indefinite

References

Forrester, Alexander I.J.; Sobester, Andras; Keane, Andy J. (2008). Engineering Design via Surrogate Modelling - A Practical Guide. John Wiley & Sons.

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Calculate negative log-likelihood

Description

This is a wrapper for the Kriging likelihood function modelKrigingLikelihood. It is intended for the case where parameters of the distance function are also optimized during maximum likelihood estimation. Thus, the wrapper receives the data, computes the parameterized distance matrix and passes it to the standard likelihood function.

Usage

modelKrigingParameterizedLikelihood(
  xt,
  xs,
  ys,
  useLambda = FALSE,
  corr = fcorrGauss,
  indefiniteMethod = "none",
  indefiniteType = "PSD",
  indefiniteRepair = FALSE,
  returnLikelihoodOnly = TRUE,
  inverter = "chol",
  distanceFunction,
  combineDistances,
  distanceParametersLower,
  ntheta,
  scaling
)

Arguments

xt

vector, containing parameters like theta, p and lambda.

xs

training samples, which are the input for the distance function. Should be in list format.

ys

vector of observations at training sample locations.

useLambda

whether to use nugget effect, i.e., lambda (FALSE at default).

corr

whether to use nugget effect, i.e., lambda (fcorrGauss at default).

indefiniteMethod

indefiniteType

The general type of correction for indefiniteness: "NSD","CNSD" or the default "PSD". See Zaefferer and Bartz-Beielstein (2016).

indefiniteRepair

boolean, whether conditions of the distance matrix (in case of "NSD","CNSD" correction type) or correlation matrix (in case of "PSD" correction type) are repaired.

returnLikelihoodOnly

boolean, whether the function should return only the likelihood, or a else a list (see return information below).

inverter

string specifying method for inversion of correlation matrix ("chol", cholesky decomposition at default, any other string leads to using the solve function).

distanceFunction

the distance function.

combineDistances

boolean, whether to combine several distances provided as a list of distance functions.

distanceParametersLower

lower boundary for the distance function(s) parameters. A vector in case of one distance, a list of vectors in case of several functions. The parameters are passed as a vector to each respective distance function.

ntheta

number of kernel parameters.

scaling

boolean, whether to scale the distance matrix.

Value

the numeric Likelihood value (if returnLikelihoodOnly is TRUE) or a list with elements:

NegLnLike: concentrated log-likelihood *-1 for minimising
Psi: correlation matrix
Psinv: inverse of correlation matrix (to save computation time in forrRegPredictor)
mu: MLE of model parameter mu
yMu: vector of observations y minus mu
SSQ: MLE of model parameter sigma^2
a: transformation vector for eigenspectrum transformation, see Zaefferer and Bartz-Beielstein (2016)
U: Matrix of eigenvectors for eigenspectrum transformation, see Zaefferer and Bartz-Beielstein (2016)
isIndefinite: whether the uncorrected correlation (kernel) matrix is indefinite

Distance based Linear Model

Description

A simple linear model based on arbitrary distances. Comparable to a k nearest neighbor model, but potentially able to extrapolate into regions of improvement. Used as a simple baseline by Zaefferer et al.(2014).

Usage

modelLinear(x, y, distanceFunction, control = list())

Arguments

x

list of samples in input space

y

matrix, vector of observations for each sample

distanceFunction

control

currently unused, defaults to list()

Value

a fit (list, modelLinear), with the options and found parameters for the model which has to be passed to the predictor function:

x: samples in input space (see parameters)
y: observations for each sample (see parameters)
distanceFunction: distance function (see parameters)

References

Examples

#set random number generator seed
set.seed(1)
#simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
#generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
#determin true objective function values
y <- fn(x)
ytest <- fn(xtest)
#build model
fit <- modelLinear(x,y,distancePermutationHamming)
#predicted obj. function values
ypred <- predict(fit,xtest)$y
#plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
    pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
abline(0,1,lty=2)

RBFN Model

Description

Implementation of a distance-based Radial Basis Function Network (RBFN) model, e.g., for mixed or combinatorial input spaces. It is based on employing suitable distance measures for the samples in input space. For reference, see the paper by Moraglio and Kattan (2011).

Usage

modelRBFN(x, y, distanceFunction, control = list())

Arguments

x

list of samples in input space

y

column vector of observations for each sample

distanceFunction

control

(list), with the options for the model building procedure:

beta: Parameter of the radial basis function: exp(-beta*D), where D is the distance matrix. If beta is not specified, the heuristic in fbeta will be used to determine it, which is default behavior.
fbeta: Function f(x) to calculate the beta parameter, x is the maximum distance observed in the input data. Default function is 1/(2*(x^2)).
distances: a distance matrix. If available, this matrix is used for model building, instead of calculating the distance matrix using the parameters distanceFunction. Default is NULL.

Value

a fit (list, modelRBFN), with the options and found parameters for the model which has to be passed to the predictor function:

SSQ: Variance of the observations (y)
centers: Centers of the RBFN model, samples in input space (see parameters)
w: Model parameters (weights) w
Phi: Gram matrix
Phinv: (Pseudo)-Inverse of Gram matrix
w0: Mean of observations (y)
dMax: Maximum observed distance
D: Matrix of distances between all samples
beta: See parameters
fbeta: See parameters
distanceFunction: See parameters

References

Moraglio, Alberto, and Ahmed Kattan. "Geometric generalisation of surrogate model based optimisation to combinatorial spaces." Evolutionary Computation in Combinatorial Optimization. Springer Berlin Heidelberg, 2011. 142-154.

Examples

#set random number generator seed
set.seed(1)
#simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
#generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
#determin true objective function values
y <- fn(x)
ytest <- fn(xtest)
#build model
fit <- modelRBFN(x,y,distancePermutationHamming)
#predicted obj. function values
ypred <- predict(fit,xtest)$y
#plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
    pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
abline(0,1,lty=2)

Bit-flip Mutation for Bit-strings

Description

Given a population of bit-strings, this function mutates all individuals by randomly inverting one or more bits in each individual.

Usage

mutationBinaryBitFlip(population, parameters)

Arguments

population

List of bit-strings

parameters

list of parameters: parameters$mutationRate => mutation rate, specifying number of bits flipped. Should be in range between zero and one

Value

mutated population

Block Inversion Mutation for Bit-strings

Description

Given a population of bit-strings, this function mutates all individuals by inverting a whole block, randomly selected.

Usage

mutationBinaryBlockInversion(population, parameters)

Arguments

population

List of bit-strings

parameters

list of parameters: parameters$mutationRate => mutation rate, specifying number of bits flipped. Should be in range between zero and one

Value

mutated population

Cycle Mutation for Bit-strings

Description

Given a population of bit-strings, this function mutates all individuals by cyclical shifting the string to the right or left.

Usage

mutationBinaryCycle(population, parameters)

Arguments

population

List of bit-strings

parameters

list of parameters: parameters$mutationRate => mutation rate, specifying number of bits flipped. Should be in range between zero and one

Value

mutated population

Single Bit-flip Mutation for Bit-strings

Description

Given a population of bit-strings, this function mutates all individuals by randomly inverting one bit in each individual. Due to the fixed mutation rate, this is computationally faster.

Usage

mutationBinarySingleBitFlip(population, parameters)

Arguments

population

List of bit-strings

parameters

not used

Value

mutated population

Insert Mutation for Permutations

Description

Given a population of permutations, this function mutates all individuals by randomly selecting two indices. The element at index1 is moved to positition index2, other elements

Usage

mutationPermutationInsert(population, parameters = list())

Arguments

population

List of permutations

parameters

list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of reversals performed, relative to the permutation length (N). 0 means none. 1 means N reversals. The default is 1/N.

Value

mutated population

Interchange Mutation for Permutations

Description

Given a population of permutations, this function mutates all individuals by randomly interchanging two arbitrary elements of the permutation.

Usage

mutationPermutationInterchange(population, parameters = list())

Arguments

population

List of permutations

parameters

list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of interchanges performed, relative to the permutation length (N). 0 means none. 1 means N interchanges. The default is 1/N.

Value

mutated population

Interchange of permutation elements

Description

Support function for mutationPermutationInterchange and mutationPermutationSwap.

Usage

mutationPermutationInterchangeCore(
  population,
  popsize,
  mutations,
  index1,
  index2
)

Arguments

population

List of permutations

popsize

population size

mutations

number of mutated elements for each individual

index1

vector of first indices, one element for each interchange

index2

vector of second indices, one element for each interchange

Value

mutated population

Reversal Mutation for Permutations

Description

Given a population of permutations, this function mutates all individuals by randomly selecting two indices, and reversing the respective sub-permutation.

Usage

mutationPermutationReversal(population, parameters = list())

Arguments

population

List of permutations

parameters

Value

mutated population

Swap Mutation for Permutations

Description

Given a population of permutations, this function mutates all individuals by randomly interchanging two adjacent elements of the permutation.

Usage

mutationPermutationSwap(population, parameters = list())

Arguments

population

List of permutations

parameters

list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of swaps performed, relative to the permutation length (N). 0 means none. 1 means N swaps. The default is 1/N.

Value

mutated population

Self-adaptive mutation operator

Description

This mutation function selects an operator and mutationRate (provided in parameters$mutationFunctions) based on self-adaptive parameters chosen for each individual separately.

Usage

mutationSelfAdapt(population, parameters)

Arguments

population

List of permutations

parameters

list, contains the available single mutation functions (mutationFunctions), and a data.frame that collects the chosen function and mutation rate for each individual (selfAdapt).

Examples

seed=0
N=5
#distance
dF <- distancePermutationHamming
#mutation
mFs <- c(mutationPermutationSwap,mutationPermutationInterchange,
mutationPermutationInsert,mutationPermutationReversal)
rFs <- c(recombinationPermutationCycleCrossover,recombinationPermutationOrderCrossover1,
recombinationPermutationPositionBased,recombinationPermutationAlternatingPosition)
mF <- mutationSelfAdapt
selfAdaptiveParameters <- list(
	mutationRate = list(type="numeric", lower=1/N,upper=1, default=1/N),
	mutationOperator = list(type="discrete", values=1:4, default=expression(sample(4,1))), 
#1: swap, 2: interchange, 3: insert, 4: reversal mutation
recombinationOperator = list(type="discrete", values=1:4, default=expression(sample(4,1))) 
#1: CycleX, 2: OrderX, 3: PositionX, 4: AlternatingPosition
)
#recombination
rF <-  recombinationSelfAdapt	 
#creation
cF <- function()sample(N)
#objective function
lF <- landscapeGeneratorUNI(1:N,dF)
#start optimization
set.seed(seed)
res <- optimEA(,lF,list(parameters=list(mutationFunctions=mFs,recombinationFunctions=rFs),
	creationFunction=cF,mutationFunction=mF,recombinationFunction=rF,
	popsize=15,budget=100,targetY=0,verbosity=1,selfAdaption=selfAdaptiveParameters,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest

Mutation for Strings

Description

Given a population of strings, this function mutates all individuals by randomly changing an element of the string.

Usage

mutationStringRandomChange(population, parameters = list())

Arguments

population

List of permutations

parameters

list of parameters, with parameters$mutationRate and parameters$lts. parameters$mutationRate should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of interchanges performed, relative to the permutation length (N). 0 means none. 1 means N interchanges. The default is 1/N. parameters$lts are the possible letters in the string.

Value

mutated population

Nearest CNSD matrix

Description

This function implements the alternating projection algorithm by Glunt et al. (1990) to calculate the nearest conditionally negative semi-definite (CNSD) matrix (or: the nearest Euclidean distance matrix). The function is similar to the nearPD function from the Matrix package, which implements a very similar algorithm for finding the nearest Positive Semi-Definite (PSD) matrix.

Usage

nearCNSD(
  x,
  eig.tol = 1e-08,
  conv.tol = 1e-08,
  maxit = 1000,
  conv.norm.type = "F"
)

Arguments

x

symmetric matrix, to be turned into a CNSD matrix.

eig.tol

eigenvalue torelance value. Eigenvalues between -tol and tol are assumed to be zero.

conv.tol

convergence torelance value. The algorithm stops if the norm of the difference between two iterations is below this value.

maxit

maximum number of iterations. The algorithm stops if this value is exceeded, even if not converged.

conv.norm.type

type of norm, by default the F-norm (Frobenius). See norm for other choices.

Value

list with:

mat: nearestCNSD matrix
normF: F-norm between original and resulting matrices
iterations: the number of performed
rel.tol: the relative value used for the tolerance convergence criterion
converged: a boolean that records whether the algorithm

References

Examples

# example using Insert distance with permutations:
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
print(D)
is.CNSD(D)
nearD <- nearCNSD(D)
print(nearD)
is.CNSD(nearD$mat)
# or example matrix from Glunt et al. (1990):
D <- matrix(c(0,1,1,1,0,9,1,9,0),3,3)
print(D)
is.CNSD(D)
nearD <- nearCNSD(D)
print(nearD)
is.CNSD(nearD$mat)
# note, that the resulting values given by Glunt et al. (1990) are 19/9 and 76/9

Two-Opt

Description

Implementation of a Two-Opt local search.

Usage

optim2Opt(x = NULL, fun, control = list())

Arguments

x

start solution of the local search

fun

function that determines cost or length of a route/permutation

control

(list), with the options:

archive: Whether to keep all candidate solutions and their fitness in an archive (TRUE) or not (FALSE). Default is TRUE.
budget: The limit on number of target function evaluations (stopping criterion) (default: 100)
creationFunction: Function to create individuals/solutions in search space. Default is a function that creates random permutations of length 6
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE

Value

a list with:

xbest: best solution found
ybest: fitness of the best solution
count: number of performed target function evaluations

References

Wikipedia contributors. "2-opt." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 13 Jun. 2014. Web. 21 Oct. 2014.

Examples

seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optim2Opt(,lF,list(creationFunction=cF,budget=100,
   vectorized=TRUE)) ##target function is "vectorized", expects list of solutions as input
res

Combinatorial Efficient Global Optimization

Description

Model-based optimization for combinatorial or mixed problems. Based on measures of distance or dissimilarity.

Usage

optimCEGO(x = NULL, fun, control = list())

Arguments

x

Optional initial design as a list. If NULL (default), creationFunction (in control list) is used to create initial design. If x has less individuals than specified by control$evalInit, creationFunction will fill up the design.

fun

target function to be minimized

control

(list), with the options of optimization and model building approaches employed:

evalInit: Number of initial evaluations (i.e., size of the initial design), integer, default is 2
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE
verbosity: Level of text output during run. Defaults to 0, no output.
plotting: Plot optimization progress during run (TRUE) or not (FALSE). Default is FALSE.
targetY: optimal value to be found, stopping criterion, default is -Inf
budget: maximum number of target function evaluations, default is 100
creationRetries: When a model does not predict an actually improving solution, a random exploration step is performed. creationRetries solutions are created randomly. For each, distance to all known solutions is calculated. The minimum distance is recorded for each random solution. The random solution with maximal minimum distance is chosen doe be evaluated in the next iteration.
model: Model to be used as a surrogate of the target function. Default is "K" (Kriging). Also available are: "LM" (linear, distance-based model), "RBFN" Radial Basis Function Network.
modelSettings: List of settings for model building, passed on as the control argument to the model training functions modelKriging, modelLinear, modelRBFN.
infill: This parameter specifies a function to be used for the infill criterion (e.g., the default is expected improvement infillExpectedImprovement). To use no specific infill criterion this has to be set to NA, in which case the prediction of the surrogate model is used. Infill criteria are only used with models that may provide some error estimate with predictions.
optimizer: Optimizer that finds the minimum of the surrogate model. Default is optimEA, an Evolutionary Algorithm.
optimizerSettings: List of settings (control) for the optimizer function.
initialDesign: Design function that generates the initial design. Default is designMaxMinDist, which creates a design that maximizes the minimum distance between points.
initialDesignSettings: List of settings (control) for the initialDesign function.
creationFunction: Function to create individuals/solutions in search space. Default is a function that creates random permutations of length 6
distanceFunction: distanceFunction a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger 1 are not a problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. With the setting control$model="K" this can also be a list of different fitness functions. Default is Hamming distance for permutations: distancePermutationHamming.

Value

a list:

xbest: best solution found
ybest: fitness of the best solution
x: history of all evaluated solutions
y: corresponding target function values f(x)
fit: model-fit created in the last iteration
fpred: prediction function created in the last iteration
count: number of performed target function evaluations
message: message string, giving information on termination reason
convergence: error/status code: -1 for termination due to failed model building, 0 for termination due to depleted budget, 1 if attained objective value is equal to or below target (control$targetY)

References

Examples

seed <- 0
#distance
dF <- distancePermutationHamming
#mutation
mF <- mutationPermutationSwap
#recombination
rF <-  recombinationPermutationCycleCrossover 
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res1 <- optimCEGO(,lF,list(
			creationFunction=cF,
			distanceFunction=dF,
			optimizerSettings=list(budget=100,popsize=10,
			mutationFunction=mF,recombinationFunction=rF),
	evalInit=5,budget=15,targetY=0,verbosity=1,model=modelKriging,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
set.seed(seed)
res2 <- optimCEGO(,lF,list(
			creationFunction=cF,
			distanceFunction=dF,
			optimizerSettings=list(budget=100,popsize=10,
			mutationFunction=mF,recombinationFunction=rF),
			evalInit=5,budget=15,targetY=0,verbosity=1,model=modelRBFN,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
res1$xbest 
res2$xbest

Evolutionary Algorithm for Combinatorial Optimization

Description

A basic implementation of a simple Evolutionary Algorithm for Combinatorial Optimization. Default evolutionary operators aim at permutation optimization problems.

Usage

optimEA(x = NULL, fun, control = list())

Arguments

x

Optional start individual(s) as a list. If NULL (default), creationFunction (in control list) is used to create initial design. If x has less individuals than the population size, creationFunction will fill up the rest.

fun

target function to be minimized

control

(list), with the options:

budget: The limit on number of target function evaluations (stopping criterion) (default: 1000).
popsize: Population size (default: 100).
generations: Number of generations (stopping criterion) (default: Inf).
targetY: Target function value (stopping criterion) (default: -Inf).
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE.
verbosity: Level of text output during run. Defaults to 0, no output.
plotting: Plot optimization progress during run (TRUE) or not (FALSE). Default is FALSE.
archive: Whether to keep all candidate solutions and their fitness in an archive (TRUE) or not (FALSE). Default is TRUE. New solutions that are identical to an archived one, will not be evaluated. Instead, their fitness is taken from the archive.
recombinationFunction: Function that performs recombination, default: recombinationPermutationCycleCrossover, which is cycle crossover for permutations.
recombinationRate: Number of offspring, defined by the fraction of the population (popsize) that will be recombined.
mutationFunction: Function that performs mutation, default: mutationPermutationSwap, which is swap mutation for permutations.
parameters: Default parameter list for the algorithm, e.g., mutation rate, etc.
selection: Survival selection process: "tournament" (default) or "truncation".
tournamentSize: Tournament size (default: 2).
tournamentProbability: Tournament probability (default: 0.9).
localSearchFunction: If specified, this function is used for a local search step. Default is NULL.
localSearchRate: Specifies on what fraction of the population local search is applied. Default is zero. Maximum is 1 (100 percent).
localSearchSettings: List of settings passed to the local search function control parameter.
stoppingCriterionFunction: Custom additional stopping criterion. Function evaluated on the population, receiving all individuals (list) and their fitness (vector). If the result is FALSE, the algorithm stops.
verbosity: >0 for text output.
creationFunction: Function to create individuals/solutions in search space. Default is a function that creates random permutations of length 6.
selfAdaption: An optional list object, that describes parameters of the optimization (see parameters) which are subject to self-adaption. An example is given in mutationSelfAdapt. Types of the parameters can be integer, discrete, or numeric.
selfAdaptTau: Positive numeric value, that controls the learning rate of numerical/integer self-adaptive parameters.
selfAdaptP: Value in [0,1]. A probability of mutation for all categorical, self-adaptive parameters.

Value

a list:

xbest: best solution found.
ybest: fitness of the best solution.
x: history of all evaluated solutions.
y: corresponding target function values f(x).
count: number of performed target function evaluations.
message: Termination message: Which stopping criterion was reached.
population: Last population.
fitness: Fitness of last population.

Examples

#First example: permutation optimization
seed=0
#distance
dF <- distancePermutationHamming
#mutation
mF <- mutationPermutationSwap
#recombination
rF <-  recombinationPermutationCycleCrossover 
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimEA(,lF,list(creationFunction=cF,mutationFunction=mF,recombinationFunction=rF,
	popsize=6,budget=60,targetY=0,verbosity=1,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest 
#Second example: binary string optimization
#number of bits
N <- 50
#target function (simple example)
f <- function(x){
 sum(x)
}
#function to create random Individuals
cf <- function(){
		sample(c(FALSE,TRUE),N,replace=TRUE)
}
#control list
cntrl <- list(
	budget = 100,
	popsize = 5,
	creationFunction = cf,
	vectorized = FALSE, #set to TRUE if f evaluates a list of individuals
	recombinationFunction = recombinationBinary2Point,
	recombinationRate = 0.1,
	mutationFunction = mutationBinaryBitFlip,
	parameters=list(mutationRate = 1/N),
	archive=FALSE #recommended for larger budgets. do not change.
)
#start algorithm
set.seed(1)
res <- optimEA(fun=f,control=cntrl)
res$xbest
res$ybest

Optimization Interface (continuous, bounded)

Description

This function is an interface fashioned like the optim function. Unlike optim, it collects a set of bound-constrained optimization algorithms with local as well as global approaches. It is, e.g., used in the CEGO package to solve the optimization problem that occurs during parameter estimation in the Kriging model (based on Maximum Likelihood Estimation). Note that this function is NOT applicable to combinatorial optimization problems.

Usage

optimInterface(x, fun, lower = -Inf, upper = Inf, control = list(), ...)

Arguments

x

is a point (vector) in the decision space of fun

fun

is the target function of type y = f(x, ...)

lower

is a vector that defines the lower boundary of search space

upper

is a vector that defines the upper boundary of search space

control

is a list of additional settings. See details.

...

additional parameters to be passed on to fun

Details

The control list contains:

funEvals: stopping criterion, number of evaluations allowed for fun (defaults to 100)
reltol: stopping criterion, relative tolerance (default: 1e-6)
factr: stopping criterion, specifying relative tolerance parameter factr for the L-BFGS-B method in the optim function (default: 1e10)
popsize: population size or number of particles (default: 10*dimension, where dimension is derived from the length of the vector lower).
restarts: whether to perform restarts (Default: TRUE). Restarts will only be performed if some of the evaluation budget is left once the algorithm stopped due to some stopping criterion (e.g., reltol).
method: will be used to choose the optimization method from the following list: "L-BFGS-B" - BFGS quasi-Newton: stats Package optim function
"nlminb" - box-constrained optimization using PORT routines: stats Package nlminb function
"DEoptim" - Differential Evolution implementation: DEoptim Package
Additionally to the above methods, several methods from the package nloptr can be chosen. The complete list of suitable nlopt methods (non-gradient, bound constraints) is:
"NLOPT_GN_DIRECT","NLOPT_GN_DIRECT_L","NLOPT_GN_DIRECT_L_RAND", "NLOPT_GN_DIRECT_NOSCAL","NLOPT_GN_DIRECT_L_NOSCAL","NLOPT_GN_DIRECT_L_RAND_NOSCAL", "NLOPT_GN_ORIG_DIRECT","NLOPT_GN_ORIG_DIRECT_L","NLOPT_LN_PRAXIS", "NLOPT_GN_CRS2_LM","NLOPT_LN_COBYLA", "NLOPT_LN_NELDERMEAD","NLOPT_LN_SBPLX","NLOPT_LN_BOBYQA","NLOPT_GN_ISRES"

All of the above methods use bound constraints. For references and details on the specific methods, please check the documentation of the packages that provide them.

Value

This function returns a list with:

xbest: parameters of the found solution
ybest: target function value of the found solution
count: number of evaluations of fun

Mixed Integer Evolution Strategy (MIES)

Description

An optimization algorithm from the family of Evolution Strategies, designed to optimize mixed-integer problems: The search space is composed of continuous (real-valued) parameters, ordinal integers and categorical parameters. Please note that the categorical parameters need to be coded as integers (type should not be a factor or character). It is an implementation (with a slight modification) of MIES as described by Li et al. (2013). Note, that this algorithm always has a step size for each solution parameter, unlike Li et al., we did not include the option to change to a single step-size for all parameters. Dominant recombination is used for solution parameters (the search space parameters), intermediate recombination for strategy parameters (i.e., step sizes). Mutation: Self-adaptive, step sizes sigma are optimized alongside the solution parameters. Real-valued parameters are subject to variation based on independent normal distributed random variables. Ordinal integers are subject to variation based on the difference of geometric distributions. Categorical parameters are changed at random, with a self-adapted probability. Note, that a more simple bound constraint method is used. Instead of the Transformation Ta,b(x) described by Li et al., optimMIES simply replaces any value that exceeds the bounds by respective boundary value.

Usage

optimMIES(x = NULL, fun, control = list())

Arguments

x

fun

target function to be minimized.

control

(list), with the options:

budget: The limit on number of target function evaluations (stopping criterion) (default: 1000).
popsize: Population size (default: 100).
generations: Number of generations (stopping criterion) (default: Inf).
targetY: Target function value (stopping criterion) (default: -Inf).
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE.
verbosity: Level of text output during run. Defaults to 0, no output.
plotting: Plot optimization progress during run (TRUE) or not (FALSE). Default is FALSE.
archive: Whether to keep all candidate solutions and their fitness in an archive (TRUE) or not (FALSE). Default is TRUE.
stoppingCriterionFunction: Custom additional stopping criterion. Function evaluated on the population, receiving all individuals (list) and their fitness (vector). If the result is FALSE, the algorithm stops.
types: A vector that specifies the data type of each variable: "numeric", "integer" or "factor".
lower: Lower bound of each variable. Factor variables can have the lower bound set to NA.
upper: Upper bound of each variable. Factor variables can have the upper bound set to NA.
levels: List of levels for each variable (only relevant for categorical variables). Should be a vector of numerical values, usually integers, but not necessarily a sequence. HAS to be given if any factors/categoricals are present. Else, set to NA.

Details

The control variables types, lower, upper and levels are especially important.

Value

a list:

xbest: best solution found.
ybest: fitness of the best solution.
x: history of all evaluated solutions.
y: corresponding target function values f(x).
count: number of performed target function evaluations.
message: Termination message: Which stopping criterion was reached.
population: Last population.
fitness: Fitness of last population.

References

Rui Li, Michael T. M. Emmerich, Jeroen Eggermont, Thomas Baeck, Martin Schuetz, Jouke Dijkstra, and Johan H. C. Reiber. 2013. Mixed integer evolution strategies for parameter optimization. Evol. Comput. 21, 1 (March 2013), 29-64.

Examples

set.seed(1)
controlList <- list(lower=c(-5,-5,1,1,NA,NA),upper=c(10,5,10,10,NA,NA),
		types=c("numeric","numeric","integer","integer","factor","factor"),
	levels=list(NA,NA,NA,NA,c(1,3,5),1:4),
		vectorized = FALSE)
objFun <- function(x){		
		x[[3]] <- round(x[[3]])
		x[[4]] <- round(x[[4]])
		y <- sum(as.numeric(x[1:4])^2) 
		if(x[[5]]==1 & x[[6]]==4)
			y <- exp(y)
		else
			y <- y^2
		if(x[[5]]==3)
			y<-y-1	
		if(x[[5]]==5)
			y<-y-2	
		if(x[[6]]==1)
			y<-y*2
		if(x[[6]]==2)
			y<-y * 1.54
		if(x[[6]]==3)
			y<- y +2
		if(x[[6]]==4)
			y<- y * 0.5	
		if(x[[5]]==1)
			y<- y * 9	
		y	
	}
res <- optimMIES(,objFun,controlList)
res$xbest
res$ybest

Max-Min-Distance Optimizer

Description

One-shot optimizer: Create a design with maximum sum of distances, and evaluate. Best candidate is returned.

Usage

optimMaxMinDist(x = NULL, fun, control = list())

Arguments

x

Optional set of solution(s) as a list, which are added to the randomly generated solutions and are also evaluated with the target function.

fun

target function to be minimized

control

(list), with the options:

budget: The limit on number of target function evaluations (stopping criterion) (default: 100).
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE.
creationFunction: Function to create individuals/solutions in search space. Default is a function that creates random permutations of length 6.
designBudget: budget of the design function designMaxMinDist, which is the number of randomly created candidates in each iteration.

Value

a list:

xbest: best solution found
ybest: fitness of the best solution
x: history of all evaluated solutions
y: corresponding target function values f(x)
count: number of performed target function evaluations

Examples

seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimMaxMinDist(,lF,list(creationFunction=cF,budget=20,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest

Combinatorial Random Search

Description

Random Search for mixed or combinatorial optimization. Solutions are generated completely at random.

Usage

optimRS(x = NULL, fun, control = list())

Arguments

x

Optional set of solution(s) as a list, which are added to the randomly generated solutions and are also evaluated with the target function.

fun

target function to be minimized

control

(list), with the options:

budget: The limit on number of target function evaluations (stopping criterion) (default: 100)
vectorized: Boolean. Defines whether target function is vectorized (takes a list of solutions as argument) or not (takes single solution as argument). Default: FALSE
creationFunction: Function to create individuals/solutions in search space. Default is a function that creates random permutations of length 6

Value

a list:

xbest: best solution found
ybest: fitness of the best solution
x: history of all evaluated solutions
y: corresponding target function values f(x)
count: number of performed target function evaluations

Examples

seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimRS(,lF,list(creationFunction=cF,budget=100,
	vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest

Optimize Surrogate Model

Description

Interface to the optimization of the surrogate model

Usage

optimizeModel(res, creationFunction, model, control)

Arguments

res

result state of the optimization process

creationFunction

Function to create individuals/solutions in search space.

model

result of the buildModel function

control

list of settings, from optimCEGO

Value

result list of the optimizer

Kriging Prediction

Description

Predict with a model fit resulting from modelKriging.

Usage

## S3 method for class 'modelKriging'
predict(object, x, ...)

Arguments

object

fit of the Kriging model (settings and parameters), of class modelKriging.

x

list of samples to be predicted

...

further arguments, not used

Value

Returned value depends on the setting of object$predAll
TRUE: list with function value (mean) object$y and uncertainty estimate object$s (standard deviation)
FALSE:object$yonly

Clustered Kriging Prediction

Description

Predict with a model fit resulting from modelKrigingClust.

Usage

## S3 method for class 'modelKrigingClust'
predict(object, newdata, ...)

Arguments

object

fit of the clustered Kriging model ensemble (settings and parameters), of class modelKrigingClust.

newdata

list of samples to be predicted

...

further arguments, currently not used

Value

list with function value (mean) object$y and uncertainty estimate object$s (standard deviation)

Predict: Combinatorial Kriging

Description

Predict with amodelLinear fit.

Usage

## S3 method for class 'modelLinear'
predict(object, x, ...)

Arguments

object

fit of the Kriging model (settings and parameters), of class modelLinear.

x

list of samples to be predicted

...

further arguments, not used

Value

numeric vector of predictions

Predict: Combinatorial RBFN

Description

Predict with a model fit resulting from modelRBFN.

Usage

## S3 method for class 'modelRBFN'
predict(object, x, ...)

Arguments

object

fit of the RBFN model (settings and parameters), of class modelRBFN.

x

list of samples to be predicted

...

further arguments, not used

Value

Returned value depends on the setting of object$predAll
TRUE: list with function value (mean) $y and uncertainty estimate $s (standard deviation)
FALSE:$yonly

Print Function: modelKriging

Description

Print information about a Kriging fit, as produced by modelKriging.

Usage

## S3 method for class 'modelKriging'
print(x, ...)

Arguments

x

fit returned by modelKriging.

...

additional parameters

Single Point Crossover for Bit Strings

Description

Given a population of bit-strings, this function recombines each individual with another individual by randomly specifying a single position. Information before that position is taken from the first parent, the rest from the second.

Usage

recombinationBinary1Point(population, parameters)

Arguments

population

List of bit-strings

parameters

not used

Value

population of recombined offspring

Two Point Crossover for Bit Strings

Description

Given a population of bit-strings, this function recombines each individual with another individual by randomly specifying 2 positions. Information in-between is taken from one parent, the rest from the other.

Usage

recombinationBinary2Point(population, parameters)

Arguments

population

List of bit-strings

parameters

not used

Value

population of recombined offspring

Arithmetic (AND) Crossover for Bit Strings

Description

Given a population of bit-strings, this function recombines each individual with another individual by computing parent1 & parent2 (logical AND).

Usage

recombinationBinaryAnd(population, parameters)

Arguments

population

List of bit-strings

parameters

not used

Value

population of recombined offspring

Uniform Crossover for Bit Strings

Description

Given a population of bit-strings, this function recombines each individual with another individual by randomly picking bits from each parent. Note, that optimEA will not pass the whole population to recombination functions, but only the chosen parents.

Usage

recombinationBinaryUniform(population, parameters)

Arguments

population

List of bit-strings

parameters

not used

Value

population of recombined offspring

Alternating Position Crossover (AP) for Permutations

Description

Given a population of permutations, this function recombines each individual with another individual. Note, that optimEA will not pass the whole population to recombination functions, but only the chosen parents.

Usage

recombinationPermutationAlternatingPosition(population, parameters)

Arguments

population

List of permutations

parameters

not used

Value

population of recombined offspring

Cycle Crossover (CX) for Permutations

Description

Usage

recombinationPermutationCycleCrossover(population, parameters)

Arguments

population

List of permutations

parameters

not used

Value

population of recombined offspring

Order Crossover 1 (OX1) for Permutations

Description

Usage

recombinationPermutationOrderCrossover1(population, parameters)

Arguments

population

List of permutations

parameters

not used

Value

population of recombined offspring

Position Based Crossover (POS) for Permutations

Description

Usage

recombinationPermutationPositionBased(population, parameters)

Arguments

population

List of permutations

parameters

not used

Value

population of recombined offspring

Self-adaptive recombination operator

Description

This recombination function selects an operator (provided in parameters$recombinationFunctions) based on self-adaptive parameters chosen for each individual separately.

Usage

recombinationSelfAdapt(population, parameters)

Arguments

population

List of permutations

parameters

list, contains the available single mutation functions (mutationFunctions), and a data.frame that collects the chosen function and mutation rate for each individual (selfAdapt).

Single Point Crossover for Strings

Description

Given a population of strings, this function recombines each individual with another random individual. Note, that optimEA will not pass the whole population to recombination functions, but only the chosen parents.

Usage

recombinationStringSinglePointCrossover(population, parameters)

Arguments

population

List of strings

parameters

not used

Value

population of recombined offspring

Remove Duplicates

Description

Remove duplicates in x, replace with non-duplicated individuals according to cf.

Usage

removeDuplicates(x, cf)

Arguments

x

List of individuals

cf

Creation function, creates random new individuals

Value

Returns x without duplicates

Remove Duplicates from Offsprings

Description

Remove duplicates in c(xhist,off), replace with non-duplicated individuals according to cf.

Usage

removeDuplicatesOffspring(xhist, off, cf, df = duplicated)

Arguments

xhist

List of previous individuals

off

List of offspring individuals

cf

Creation function, creates random new individuals

df

Dupliate Function. This function determines which elements in a list/population are duplicates. By default, this is the duplicated function from R-base.

Value

Returns off without duplicates

Repair Conditions of a Correlation Matrix

Description

This function repairs correlation matrices, so that the following two properties are ensured: The correlations values should be between -1 and 1, and the diagonal values should be one.

Usage

repairConditionsCorrelationMatrix(mat)

Arguments

mat

symmetric, PSD distance matrix. If your matrix is not CNSD, use correctionDefinite first. Or use correctionKernelMatrix.

Value

repaired correlation matrix

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
K <- correctionDefinite(K,type="PSD")$mat
K
K <- repairConditionsCorrelationMatrix(K)

Repair Conditions of a Distance Matrix

Description

This function repairs distance matrices, so that the following two properties are ensured: The distance values should be non-zero and the diagonal should be zero. Other properties (conditionally negative semi-definitene (CNSD), symmetric) are assumed to be given.

Usage

repairConditionsDistanceMatrix(mat)

Arguments

mat

symmetric, CNSD distance matrix. If your matrix is not CNSD, use correctionCNSD first. Or use correctionDistanceMatrix.

Value

repaired distance matrix

References

Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.

Examples

x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
D <- correctionCNSD(D)
D
D <- repairConditionsDistanceMatrix(D)
D

Self-adaption of EA parameters.

Description

Learning / self-adaption of parameters of the evolutionary algorithm.

Usage

selfAdapt(
  params,
  inum,
  icat,
  iint,
  nnum,
  ncat,
  nint,
  lower,
  upper,
  values,
  tau,
  p
)

Arguments

params

parameters to be self-adapted

inum

boolean vector, which parameters are numeric

icat

boolean vector, which parameters are discrete, factors

iint

boolean vector, which parameters are integer

nnum

number of numerical parameters

ncat

number of discrete parameters

nint

number of integer parameters

lower

lower bounds (numeric, integer parameters only)

upper

upper bounds (numeric, integer parameters only)

values

values or levels of the discrete parameters

Kriging Simulation

Description

(Conditional) Simulate at given locations, with a model fit resulting from modelKriging. In contrast to prediction or estimation, the goal is to reproduce the covariance structure, rather than the data itself. Note, that the conditional simulation also reproduces the training data, but has a two times larger error than the Kriging predictor.

Usage

## S3 method for class 'modelKriging'
simulate(
  object,
  nsim = 1,
  seed = NA,
  xsim,
  conditionalSimulation = TRUE,
  returnAll = FALSE,
  ...
)

Arguments

object

fit of the Kriging model (settings and parameters), of class modelKriging.

nsim

number of simulations

seed

random number generator seed. Defaults to NA, in which case no seed is set

xsim

list of samples in input space, to be simulated

conditionalSimulation

logical, if set to TRUE (default), the simulation is conditioned with the training data of the Kriging model. Else, the simulation is non-conditional.

returnAll

if set to TRUE, a list with the simulated values (y) and the corresponding covariance matrix (covar) of the simulated samples is returned.

...

further arguments, not used

Value

Returned value depends on the setting of object$simulationReturnAll

References

N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.

C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.

Binary String Generator Function

Description

Returns a function that generates random bit-strings of length N. Can be used to create individuals of NK-Landscapes or other problems with binary representation.

Usage

solutionFunctionGeneratorBinary(N)

Arguments

N

length of the bit-strings

Value

returns a function, without any arguments

Permutation Generator Function

Description

Returns a function that generates random permutations of length N. Can be used to generate individual solutions for permutation problems, e.g., Travelling Salesperson Problem

Usage

solutionFunctionGeneratorPermutation(N)

Arguments

N

length of the permutations returned

Value

returns a function, without any arguments

Examples

fun <- solutionFunctionGeneratorPermutation(10)
fun()
fun()
fun()

String Generator Function

Description

Returns a function that generates random strings of length N, with given letters. Can be used to generate individual solutions for permutation problems, e.g., Travelling Salesperson Problem

Usage

solutionFunctionGeneratorString(N, lts = c("A", "C", "G", "T"))

Arguments

N

length of the permutations returned

lts

letters allowed in the string

Value

returns a function, without any arguments

Examples

fun <- solutionFunctionGeneratorString(10,c("A","C","G","T"))
fun()
fun()
fun()

2-Opt Step

Description

Helper function: A single 2-opt step for optim2Opt

Usage

step2Opt(route, i, k)

Arguments

route

route to be partially reversed

i

start of reversal

k

end of reversal

Value

a new route

Simulation-based Test Function Generator, Data Interface

Description

Generate test functions for assessment of optimization algorithms with non-conditional or conditional simulation, based on real-world data.

Usage

testFunctionGeneratorSim(
  x,
  y,
  xsim,
  distanceFunction,
  controlModel = list(),
  controlSimulation = list()
)

Arguments

x

list of samples in input space, training data

y

column vector of observations for each sample, training data

xsim

list of samples in input space, for simulation

distanceFunction

a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. It can also be a list of several distance functions. In this case, Maximum Likelihood Estimation (MLE) is used to determine the most suited distance measure. The distance function may have additional parameters. For that case, see distanceParametersLower/Upper in the controls. If distanceFunction is missing, it can also be provided in the control list.

controlModel

(list), with the options for the model building procedure, it will be passed to the modelKriging function.

controlSimulation

(list), with the parameters of the simulation:

nsim: the number of simulations, or test functions, to be created.
conditionalSimulation: whether (TRUE) or not (FALSE) to use conditional simulation.
simulationSeed: a random number generator seed. Defaults to NA; which means no seed is set. For sake of reproducibility, set this to some integer value.

Value

a list with the following elements: fun is a list of functions, where each function is the interpolation of one simulation realization. The length of the list depends on the nsim parameter. fit is the result of the modeling procedure, that is, the model fit of class modelKriging.

References

N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.

C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.

Examples

nsim <- 10
seed <- 12345
n <- 6
set.seed(seed)
#target function:
fun <- function(x){
  exp(-20* x) + sin(6*x^2) + x
}
# "vectorize" target
f <- function(x){sapply(x,fun)}
dF <- function(x,y)(sum((x-y)^2)) #sum of squares 
# plot params
par(mfrow=c(4,1),mar=c(2.3,2.5,0.2,0.2),mgp=c(1.4,0.5,0))
#test samples for plots
xtest <- as.list(seq(from=-0,by=0.005,to=1))
plot(xtest,f(xtest),type="l",xlab="x",ylab="Obj. function")
#evaluation samples (training)
xb <- as.list(runif(n))
yb <- f(xb)
# support samples for simulation
x <- as.list(sort(c(runif(100),unlist(xb))))
# fit the model	and simulate: 
res <- testFunctionGeneratorSim(xb,yb,x,dF,
   list(algThetaControl=list(method="NLOPT_GN_DIRECT_L",funEvals=100),
     useLambda=FALSE),
   list(nsim=nsim,conditionalSimulation=FALSE))
fit <- res$fit
fun <- res$fun
#predicted obj. function values
ypred <- predict(fit,as.list(xtest))$y
plot(unlist(xtest),ypred,type="l",xlab="x",ylab="Estimation")
points(unlist(xb),yb,pch=19)
##############################	
# plot non-conditional simulation
##############################
ynew <- NULL
for(i in 1:nsim)
  ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Simulation")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
##############################	
# create and plot test function, conditional
##############################
fun <- testFunctionGeneratorSim(xb,yb,x,dF,
   list(algThetaControl=
     list(method="NLOPT_GN_DIRECT_L",funEvals=100),
			useLambda=FALSE),
   list(nsim=nsim,conditionalSimulation=TRUE))$fun
ynew <- NULL
for(i in 1:nsim)
  ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Conditional sim.")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
points(unlist(xb),yb,pch=19)

Tournament Selection

Description

Simple Tournament Selection implementation.

Usage

tournamentSelection(
  fitness,
  tournamentSize,
  tournamentProbability,
  selectFromN
)

Arguments

fitness

Fitness values of individuals

tournamentSize

Tournament Size

tournamentProbability

Tournament Probability

selectFromN

Number of tournament winners

Value

index of tournament winners

Combinatorial Efficient Global Optimization in R

Description

Details

Acknowledgments

Author(s)

References

See Also

Create Flow shop Scheduling Problem (FSP) Benchmark

Description

Usage

Arguments

Value

See Also

Examples

MaxCut Benchmark Creation

Description

Usage

Arguments

Value

Examples

NK-Landscape Benchmark Creation

Description

Usage

Arguments

Value

Examples

Create Quadratic Assignment Problem (QAP) Benchmark

Description

Usage

Arguments

Value

See Also

Examples

Create (Asymmetric) Travelling Salesperson Problem (TSP) Benchmark

Description

Usage

Arguments

Value

See Also

Examples

Create single-machine total Weighted Tardiness (WT) Problem Benchmark

Description

Usage

Arguments

Value

See Also

Examples

Model building

Description

Usage

Arguments

Value

See Also

Linear Distance-Based Model

Description

Usage

Arguments

Radial Basis Function Network

Description

Usage

Arguments

Compute Correlation Matrix

Description

Usage

Arguments

Value

See Also

Augmented Distance Correction

Description

Usage

Arguments

Value

References

Augmented Kernel Correction

Description

Usage

Arguments

Value

References

Correcting Conditional Negative Semi-Definiteness