Type: | Package |
Title: | Imputation of Time Series Based on Dynamic Time Warping |
Version: | 1.1 |
Date: | 2018-07-10 |
Author: | Camille Dezecache, T. T. Hong Phan, Emilie Poisson-Caillault |
Maintainer: | Emilie Poisson-Caillault <emilie.poisson@univ-littoral.fr> |
Description: | Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <doi:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference). |
Depends: | R (≥ 3.0.0) |
Imports: | dtw, rlist, stats, e1071, entropy, lsa |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
RoxygenNote: | 6.0.1 |
URL: | http://mawenzi.univ-littoral.fr/DTWBI/ |
NeedsCompilation: | no |
Packaged: | 2018-07-10 14:10:33 UTC; camille |
Repository: | CRAN |
Date/Publication: | 2018-07-11 10:50:16 UTC |
Imputation of Time Series Based on Dynamic Time Warping
Description
Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference).
Details
Index of help topics:
DTWBI-package            Imputation of Time Series Based on Dynamic Time Warping
DTWBI_univariate         DTWBI algorithm for univariate signals
compute.fa2              FA2
compute.fb               Fractional Bias (FB)
compute.fsd              Fraction of Standard Deviation (FSD)
compute.nmae             Normalized Mean Absolute Error (NMAE)
compute.rmse             Root Mean Square Error (RMSE)
compute.sim              Similarity
dataDTWBI                Six univariate signals as example for DTWBI package
dist_afbdtw              Adaptive Feature Based Dynamic Time Warping algorithm
gapCreation              Gap creation
local.derivative.ddtw    Local derivative estimate to compute DDTW
minCost                  DTW-based methods for univariate signals
Author(s)
Camille Dezecache, T. T. Hong Phan, Emilie Poisson-Caillault
Maintainer: Emilie Poisson-Caillault <emilie.poisson@univ-littoral.fr>
References
Thi-Thu-Hong Phan, Emilie Poisson-Caillault, Alain Lefebvre, Andre Bigand. Dynamic time warping-based imputation for univariate time series data. Pattern Recognition Letters, Elsevier, 2017, <DOI:10.1016/j.patrec.2017.08.019>. <hal-01609256>
Examples
# Load package dataset
data(dataDTWBI)
# Create a query and a reference signal.
# ref keeps the complete signal so the imputation can be compared with the true values.
query <- dataDTWBI$query
ref <- dataDTWBI$query
# Create a gap within query (10% of signal size)
query <- gapCreation(query, rate = 0.1)
data <- query$output_vector
begin_gap <- query$begin_gap
size_gap <- query$gap_size
# Fill gap using DTWBI algorithm
results_DTWBI <- DTWBI_univariate(data, t_gap = begin_gap, T_gap = size_gap)
# Plot
plot(ref, type = "l")
lines(results_DTWBI$output_vector, col = "red", lty = "dashed")
# Compute the similarity between the imputed vector (Y) and the reference (X)
compute.sim(results_DTWBI$output_vector, ref)
DTWBI algorithm for univariate signals
Description
Imputes the values of a gap at position t_gap and of size T_gap in a univariate signal, based on the DTW algorithm. For more details on the method, see Phan et al. (2017), <doi:10.1016/j.patrec.2017.08.019>. Default arguments of the dtw() function are used, but they can be set explicitly and modified.
Usage
DTWBI_univariate(data, t_gap, T_gap, DTW_method = "DTW",
threshold_cos = NULL, step_threshold = NULL, thresh_cos_stop = 0.8, ...)
Arguments
data |
input vector containing a large continuous gap (possibly derived from the local.derivative.ddtw() function) |
t_gap |
location of the beginning of the gap (possibly extracted from the gapCreation function) |
T_gap |
gap size (possibly extracted from the gapCreation function) |
DTW_method |
DTW method used for imputation ("DTW", "DDTW", "AFBDTW"). By default "DTW". |
threshold_cos |
cosine threshold used to select sequences similar to the query. By default, threshold_cos = 0.9995 if the sequence is longer than 10,000, and threshold_cos = 0.995 otherwise. |
step_threshold |
step used within the loop that determines the threshold. By default, step_threshold = 50 if the sequence is longer than 10,000, step_threshold = 10 if the sequence length is between 1,000 and 10,000, and step_threshold = 2 otherwise. |
thresh_cos_stop |
lowest acceptable cosine threshold for finding a window similar to the query. By default, thresh_cos_stop = 0.8. |
... |
additional arguments from the dtw() function |
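The documented defaults for threshold_cos and step_threshold depend only on the length of the input sequence. The sketch below restates that rule in base R. The helper name default_thresholds is hypothetical and not part of DTWBI, and the treatment of the exact boundaries (lengths of exactly 1,000 or 10,000) is an assumption, since the text above does not specify them.

```r
# Hypothetical helper (not part of DTWBI): restates the documented defaults.
# Boundary handling at exactly 1,000 and 10,000 is an assumption.
default_thresholds <- function(n) {
  threshold_cos <- if (n > 10000) 0.9995 else 0.995
  step_threshold <- if (n > 10000) {
    50
  } else if (n >= 1000) {
    10
  } else {
    2
  }
  list(threshold_cos = threshold_cos, step_threshold = step_threshold)
}

# Example: defaults that would apply to a signal of 5,000 points
default_thresholds(5000)
```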
Value
DTWBI_univariate returns a list containing the following elements:
output_vector: output vector containing complete data including the imputation proposal
input_vector: original vector used as input
query: the query, i.e. the sequence adjacent to the gap
pos_query: indices of the beginning and end of the query
sim_window: vector containing the values of the sequence most similar to the query
pos_sim_window: indices of the beginning and end of the similar window
imputation_window: vector containing imputed values
pos_imp_window: indices of the beginning and end of the imputation window
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1]
rate <- 0.1
output <- gapCreation(X, rate)
data <- output$output_vector
gap_begin <- output$begin_gap
gap_size <- output$gap_size
imputed_data <- DTWBI_univariate(data, t_gap=gap_begin, T_gap=gap_size)
plot(imputed_data$input_vector, type = "l", lwd = 2) # Incomplete signal
lines(imputed_data$output_vector, col = "red") # Imputed signal
lines(y = imputed_data$query,
x = imputed_data$pos_query[1]:imputed_data$pos_query[2],
col = "green", lwd = 4) # Query
lines(y = imputed_data$sim_window,
x = imputed_data$pos_sim_window[1]:imputed_data$pos_sim_window[2],
col = "orange", lwd = 4) # Similar sequence to the query
lines(y = imputed_data$imputation_window,
x = imputed_data$pos_imp_window[1]:imputed_data$pos_imp_window[2],
col = "blue", lwd = 4) # Imputation proposal
FA2
Description
Estimates the FA2 of two univariate signals Y (imputed values) and X (true values).
Usage
compute.fa2(Y, X, verbose = F)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
verbose |
if TRUE, print advice about the quality of the model |
Details
This function returns the FA2 of two vectors corresponding to univariate signals X (true values) and Y (imputed values).
FA2 is the proportion of pairs of values $(x_i, y_i)$ satisfying the condition $0.5 \le y_i / x_i \le 2$.
The closer FA2 is to 1, the more accurate the imputation model.
Both vectors Y and X must be of equal length; otherwise an error is raised.
Any NA in either input vector is excluded, and a warning is displayed.
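As an illustration of the condition above, FA2 can be sketched in a few lines of base R. The function fa2_sketch is a hypothetical re-implementation for illustration, not the package source. It assumes that a pair of zeros counts as satisfying the condition, as stated in the examples for compute.fa2.

```r
# Hypothetical re-implementation for illustration; not the DTWBI source code.
fa2_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  X <- X[keep]
  Y <- Y[keep]
  ratio <- Y / X                     # 0/0 gives NaN, y/0 gives +/-Inf
  ok <- ratio >= 0.5 & ratio <= 2    # NaN yields NA here
  ok[X == 0 & Y == 0] <- TRUE        # documented convention for (0, 0) pairs
  mean(ok)                           # proportion of pairs within a factor of 2
}

# Ratios are 1, 2, 3, 4: two of the four pairs satisfy the condition
fa2_sketch(c(1, 2, 3, 4), c(1, 1, 1, 1))
```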
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fa2(Y,X)
compute.fa2(Y,X, verbose = TRUE)
# By definition, if pairs of true and imputed values are zero,
# FA2 corresponding to this pair of values equals 1.
X[1] <- 0
Y[1] <- 0
compute.fa2(Y,X)
Fractional Bias (FB)
Description
Estimates the Fractional Bias (FB) of two univariate signals Y (imputed values) and X (true values).
Usage
compute.fb(Y, X, verbose = F)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
verbose |
if TRUE, print advice about the quality of the model |
Details
This function returns the FB of two vectors corresponding to univariate signals, indicating whether predicted values are underestimated or overestimated compared to true values.
A perfect imputation model gets $FB = 0$.
An acceptable imputation model gives $FB \le 0.3$.
Both vectors Y and X must be of equal length; otherwise an error is raised.
Any NA in either input vector is excluded, and a warning is displayed.
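The text above does not state the FB formula. The sketch below assumes the common definition $FB = 2 \, |(\bar{y} - \bar{x}) / (\bar{y} + \bar{x})|$, which is consistent with the special cases shown in the examples for compute.fb (FB = 0 for two all-zero vectors, FB = Inf for opposed constant vectors). The function fb_sketch is hypothetical and may differ from compute.fb in details.

```r
# Hypothetical sketch; the formula is an assumption, not taken from the package source.
fb_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  X <- X[keep]
  Y <- Y[keep]
  mx <- mean(X)
  my <- mean(Y)
  if (mx == 0 && my == 0) {
    # Convention from the examples: FB = 0 only when both vectors are all zeros
    if (all(X == 0) && all(Y == 0)) return(0)
    stop("FB cannot be estimated when both means are zero")
  }
  2 * abs((my - mx) / (my + mx))
}

# Imputed mean 3 versus true mean 1: 2 * |2 / 4| = 1
fb_sketch(c(3, 3), c(1, 1))
```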
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fb(Y,X)
compute.fb(Y,X, verbose = TRUE)
# If mean(X)=mean(Y)=0, it is impossible to estimate FB,
# unless both true and imputed values vectors are constant.
# By definition, in this case, FB = 0.
X <- rep(0, 10) ; Y <- rep(0, 10)
compute.fb(Y,X)
# If true and imputed values are not zero and are opposed, FB = Inf.
X <- rep(runif(1), 10)
Y <- -X
compute.fb(Y,X)
Fraction of Standard Deviation (FSD)
Description
Estimates the Fraction of Standard Deviation (FSD) of two univariate signals Y (imputed values) and X (true values).
Usage
compute.fsd(Y, X, verbose = F)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
verbose |
if TRUE, print advice about the quality of the model |
Details
This function returns the FSD of two vectors corresponding to univariate signals. Values of FSD closer to zero indicate better performance of the imputation method. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA in either input vector is excluded, and a warning is displayed.
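The FSD formula is not given above. The sketch below assumes $FSD = 2 \, |SD(y) - SD(x)| / (SD(x) + SD(y))$, which matches the special cases in the examples for compute.fsd (FSD = 0 for equal constant vectors, an error for different constant vectors). The function fsd_sketch is hypothetical and is not the package implementation.

```r
# Hypothetical sketch; the formula is an assumption, not taken from the package source.
fsd_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  X <- X[keep]
  Y <- Y[keep]
  sx <- stats::sd(X)
  sy <- stats::sd(Y)
  if (sx == 0 && sy == 0) {
    # Both signals are constant: FSD = 0 if they are equal, undefined otherwise
    if (all(X == Y)) return(0)
    stop("FSD cannot be computed for two different constant vectors")
  }
  2 * abs(sy - sx) / (sx + sy)
}

# sd(Y) = 2 and sd(X) = 1, so FSD = 2 * 1 / 3
fsd_sketch(c(2, 4, 6), c(1, 2, 3))
```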
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fsd(Y,X)
compute.fsd(Y,X, verbose = TRUE)
# By definition, if true and imputed values are equal and constant,
# FSD = 0.
X <- rep(runif(1), 10)
Y <- X
compute.fsd(Y,X)
# However, if true and imputed values are constant but different,
# FSD is not calculable. An error is displayed.
## Not run:
X <- rep(runif(1), 10);Y <- rep(runif(1), 10)
compute.fsd(Y,X)
## End(Not run)
Normalized Mean Absolute Error (NMAE)
Description
Estimates the Normalized Mean Absolute Error of two univariate signals Y (imputed values) and X (true values).
Usage
compute.nmae(Y, X)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
Details
This function returns the NMAE of two vectors corresponding to univariate signals.
A lower NMAE ($NMAE \in [0, \infty)$) indicates better performance of the imputation method.
Both vectors Y and X must be of equal length; otherwise an error is raised.
Any NA in either input vector is excluded, and a warning is displayed.
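A minimal sketch of NMAE follows, assuming the mean absolute error is normalized by the range of the true values, $\max(x) - \min(x)$. This assumption is consistent with the examples for compute.nmae, where a constant vector of true values makes NMAE infinite and MAE is returned instead. The function nmae_sketch is hypothetical.

```r
# Hypothetical sketch; normalization by the range of X is an assumption.
nmae_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  X <- X[keep]
  Y <- Y[keep]
  mae <- mean(abs(Y - X))
  rng <- max(X) - min(X)
  if (rng == 0) {
    if (all(X == Y)) return(0)       # documented convention: NMAE = 0
    warning("true values are constant: returning MAE instead of NMAE")
    return(mae)
  }
  mae / rng
}

# MAE = 1 and range of true values = 2, so NMAE = 0.5
nmae_sketch(c(1, 2, 3), c(0, 1, 2))
```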
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.nmae(Y,X)
# If true values is a constant vector, NMAE = Inf.
# A warning is displayed and MAE is estimated instead of NMAE,
# unless true and imputed values are equal. In this case,
# by definition, NMAE = 0.
X <- rep(0, 10)
Y <- runif(10)
compute.nmae(Y,X) # MAE computed
Y <- X
compute.nmae(Y,X) # By definition, NMAE = 0
Root Mean Square Error (RMSE)
Description
Estimates the Root Mean Square Error of two univariate signals Y (imputed values) and X (true values).
Usage
compute.rmse(Y, X)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
Details
This function returns the RMSE of two vectors corresponding to univariate signals.
A lower RMSE ($RMSE \in [0, \infty)$) indicates better performance of the imputation method.
Both vectors Y and X must be of equal length; otherwise an error is raised.
Any NA in either input vector is excluded, and a warning is displayed.
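For reference, RMSE follows the standard definition $\sqrt{\frac{1}{T} \sum_{i=1}^{T} (y_i - x_i)^2}$. The base R sketch below illustrates it. The function rmse_sketch is hypothetical, and its NA handling is a simplified stand-in for the behavior described above (no warning is emitted).

```r
# Hypothetical sketch of the standard RMSE definition; not the package source.
rmse_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  sqrt(mean((Y[keep] - X[keep])^2))
}

# Squared errors are 9 and 16, so RMSE = sqrt(12.5)
rmse_sketch(c(3, 4), c(0, 0))
```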
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.rmse(Y,X)
Similarity
Description
Estimates the percentage of similarity of two univariate signals Y (imputed values) and X (true values).
Usage
compute.sim(Y, X)
Arguments
Y |
vector of imputed values |
X |
vector of true values |
Details
This function returns the similarity of two vectors corresponding to univariate signals.
A higher similarity ($Similarity \in [0, 1]$) indicates a more accurate method for completing missing values in univariate datasets.
Both vectors Y and X must be of equal length; otherwise an error is raised.
Any NA in either input vector is excluded, and a warning is displayed.
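The similarity formula is not spelled out above. The sketch below assumes $Sim(y, x) = \frac{1}{T} \sum_{i=1}^{T} \frac{1}{1 + |y_i - x_i| / (\max(x) - \min(x))}$. The function sim_sketch is hypothetical. It deliberately leaves out the constant-vector conventions mentioned in the examples for compute.sim and stops instead.

```r
# Hypothetical sketch; the formula is an assumption, not taken from the package source.
sim_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !(is.na(X) | is.na(Y))     # drop pairs that contain an NA
  X <- X[keep]
  Y <- Y[keep]
  rng <- max(X) - min(X)
  if (rng == 0) stop("constant true values are not handled by this sketch")
  mean(1 / (1 + abs(Y - X) / rng))
}

# Identical signals give a similarity of 1
sim_sketch(c(0, 1, 2), c(0, 1, 2))
```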
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.sim(Y,X)
# By definition, if the true values form a constant vector
# and one or more imputed values are equal to the true values,
# similarity = 1.
X <- rep(2, 10)
Y <- X
compute.sim(Y,X)
Six univariate signals as example for DTWBI package
Description
Query and ref1 are two phase-shifted sigmoidal signals. Ref2 presents a linear decrease. Ref3 and ref4 are constant signals of value 3 and 0 respectively. Ref5 is similar to the query with small noise added.
Usage
dataDTWBI
Format
A data frame with six variables: query, ref1, ref2, ref3, ref4 and ref5.
Adaptive Feature Based Dynamic Time Warping algorithm
Description
This function estimates a distance matrix that is used as input to the dtw() function (package dtw) to align two univariate signals following the Adaptive Feature Based Dynamic Time Warping algorithm (AFBDTW).
Usage
dist_afbdtw(q, r, w1 = 0.5)
Arguments
q |
query vector |
r |
reference vector |
w1 |
weight of the local feature relative to the global feature. By default, w1 = 0.5, and by definition, w2 = 1 - w1. |
Value
A list containing the following elements:
query: the query vector
response: the response vector
query_local: local feature of the query
response_local: local feature of the response vector
query_global: global feature of the query
response_global: global feature of the response vector
dist_local: distance matrix of the local feature
dist_global: distance matrix of the global feature
distAFBDTW: AFBDTW distance matrix
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
AFBDTW_Dist <- dist_afbdtw(X, Y)
Gap creation
Description
This function creates a large continuous gap within a univariate signal. Gap size is defined as a percentage of input vector length. By default, the created gap starts at a random location.
Usage
gapCreation(X, rate, begin = NULL)
Arguments
X |
input vector |
rate |
size of the desired gap, as a fraction of the input vector length (e.g. 0.1 for 10%) |
begin |
location of the beginning of the gap (random by default) |
Value
gapCreation returns a list containing the following elements:
output_vector: output vector containing the created gap
input_vector: original vector used as input
begin_gap: index of the beginning of the gap
rate: size of the created gap as a fraction of the input vector length
gap_size: length of the created gap
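To make the behavior described above concrete, here is a minimal base R sketch that replaces one contiguous block of values with NA. The function create_gap_sketch is hypothetical. Rounding the gap size with round() and drawing the start uniformly among valid positions are assumptions, and gapCreation() itself may differ in these details.

```r
# Hypothetical sketch; gapCreation() may round or sample differently.
create_gap_sketch <- function(X, rate, begin = NULL) {
  n <- length(X)
  gap_size <- round(rate * n)                    # assumed rounding rule
  stopifnot(gap_size >= 1, gap_size < n)
  if (is.null(begin)) {
    begin <- sample.int(n - gap_size + 1, 1)     # random start that keeps the gap inside X
  }
  stopifnot(begin >= 1, begin + gap_size - 1 <= n)
  out <- X
  out[begin:(begin + gap_size - 1)] <- NA        # one contiguous run of missing values
  list(output_vector = out, input_vector = X,
       begin_gap = begin, rate = rate, gap_size = gap_size)
}

# A 10% gap starting at index 20 in a signal of 100 points
g <- create_gap_sketch(sin(seq(0, 10, length.out = 100)), rate = 0.1, begin = 20)
which(is.na(g$output_vector))
```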
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1]
rate <- 0.1
output <- gapCreation(X, rate)
plot(output$input_vector, type = "l", col = "red", lwd = 2)
lines(output$output_vector, lty = "dashed", lwd = 2)
Local derivative estimate to compute DDTW
Description
This function estimates the local derivative of a vector. The result can be used as input to the dtw() function (package dtw) to align two univariate signals.
Usage
local.derivative.ddtw(X)
Arguments
X |
input vector for which the local derivative is calculated |
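The text above does not state which derivative estimate is used. The sketch below assumes the estimate commonly used for Derivative DTW (Keogh and Pazzani), $D(x_i) = \frac{(x_i - x_{i-1}) + (x_{i+1} - x_{i-1}) / 2}{2}$, with the first and last values copied from their neighbors. The function ddtw_derivative_sketch is hypothetical, and local.derivative.ddtw() may treat the end points differently.

```r
# Hypothetical sketch of the usual DDTW derivative estimate; end-point handling is an assumption.
ddtw_derivative_sketch <- function(X) {
  n <- length(X)
  stopifnot(n >= 3)
  i <- 2:(n - 1)
  d <- numeric(n)
  # Average of the left slope and the slope through the two neighbors
  d[i] <- ((X[i] - X[i - 1]) + (X[i + 1] - X[i - 1]) / 2) / 2
  d[1] <- d[2]                       # copy neighbors at the boundaries
  d[n] <- d[n - 1]
  d
}

# A straight line with slope 1 has an estimated derivative of 1 everywhere
ddtw_derivative_sketch(1:5)
```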
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1]
local.derivative.ddtw(X)
# Plot
plot(X, type = "b", ylim = c(-1, 1))
lines(local.derivative.ddtw(X), col = "red")
DTW-based methods for univariate signals
Description
Finds the optimal alignment between two univariate time series based on DTW methods.
Usage
minCost(X, Y, method, ...)
Arguments
X |
query vector |
Y |
response vector |
method |
alignment method, one of "DTW", "DDTW", "AFBDTW", "DTW-D" |
... |
additional arguments from functions dtw or dist_afbdtw |
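For readers unfamiliar with DTW, the following base R sketch shows the dynamic programming recursion behind the alignment cost. It is an illustration under simplifying assumptions: absolute difference as the local distance and the simplest step pattern (match, insertion, deletion with equal weights), which is not necessarily what dtw() uses by default. The function dtw_cost_sketch is hypothetical, and minCost() relies on the dtw package rather than on code like this.

```r
# Hypothetical sketch of the classic DTW recursion; not how minCost() is implemented.
dtw_cost_sketch <- function(x, y) {
  n <- length(x)
  m <- length(y)
  D <- matrix(Inf, n + 1, m + 1)     # cumulative cost matrix with a padding row and column
  D[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- abs(x[i] - y[j])       # local distance between the two samples
      D[i + 1, j + 1] <- cost + min(D[i, j + 1],   # step in x only
                                    D[i + 1, j],   # step in y only
                                    D[i, j])       # step in both
    }
  }
  D[n + 1, m + 1]                    # total cost of the best warping path
}

# A repeated sample in y is absorbed by warping, so the cost stays at 0
dtw_cost_sketch(c(1, 2, 3), c(1, 2, 2, 3))
```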
Author(s)
Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault
Examples
data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
# Plot query and reference
plot(X, type = "l", ylim = c(-5,3))
lines(seq_along(X), Y, col = "red")
#= Align signals using DTW
align_dtw <- minCost(X, Y, method = "DTW")
#= Align signals using DDTW
align_ddtw <- minCost(X, Y, method = "DDTW")
#= Align signals using AFBDTW
align_afbdtw <- minCost(X, Y, method = "AFBDTW")
#= Align signals using DTW-D
align_dtwd <- minCost(X, Y, method = "DTW-D")
#= Plots
library(dtw)
dtwPlotTwoWay(d = align_dtw, xts = X, yts = Y, main = "DTW")
dtwPlotTwoWay(d = align_ddtw, xts = X, yts = Y, main = "DDTW")
dtwPlotTwoWay(d = align_afbdtw, xts = X, yts = Y, main = "AFBDTW")
dtwPlotTwoWay(d = align_dtwd, xts = X, yts = Y, main = "DTW-D")
#= Compare cost of each method
comparative_cost <- matrix(c(align_dtw$normalizedDistance,
align_ddtw$normalizedDistance,
align_afbdtw$normalizedDistance,
align_dtwd$normalizedDistance), ncol = 4)
colnames(comparative_cost) <- c("DTW", "DDTW", "AFBDTW", "DTW-D")
comparative_cost