
A Deep Autoencoder based Outlier Detection for Time Series

Jin Wang 1,a, Fang Miao 2,b,*, Lei You 1,c, and Wenjie Fan 1,d

1 School of Information Science & Technology, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan, China
2 Institute of Big Data, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan, China

a [email protected], b [email protected], c [email protected]

*Corresponding Author: Fang Miao

Keywords: outlier detection, time series, variational auto-encoder, machine learning, virtual assets.

Abstract: Time series outlier detection is an important topic in data mining with significant real-world applications. Due to the complexity and dynamics of time series, detecting outliers in them is quite difficult. In particular, influenced by outside factors, time series are often unpredictable and accompanied by concept drift. Recently, recurrent neural networks (RNNs) have been used to identify time series outliers and have demonstrated great potential. However, an RNN usually uses a deterministic state transition structure, which cannot characterize the variability of high-dimensional time series. This paper proposes to incorporate latent variables into the RNN, aiming to capture as much of the time series variability as possible. In particular, our method combines the RNN with the variational auto-encoder framework. We evaluate our method on several real datasets and demonstrate its superior detection performance.

1. Introduction

Outlier detection is an important topic in machine learning and data mining. It aims to find data patterns that deviate from most others, and a large body of research has focused on it [1]. Recently, with the emergence of the Internet of Things, data stream mining has become more important, and finding outliers in data streams is especially critical. It has many applications in healthcare [2], financial fraud detection [3], and even video event detection [4]. Since a data stream is usually dynamic and endless, mining outliers from it is difficult.

Some methods identify outliers in data streams by combining sliding windows with K-Nearest Neighbors (KNN) [5]. However, these methods usually need many hyperparameters, such as the neighbor count k in KNN. Recently, the RNN has emerged as a promising deep learning method for handling time series. In particular, an RNN can use a deterministic transition function to predict the following data.


However, a deterministic transition function usually cannot characterize the randomness of time series, which is quite common in practice due to environmental interference.

This paper proposes to identify outliers in data streams with an RNN. In order to characterize the randomness of time series, we further incorporate latent variables into the RNN based on the variational auto-encoder framework. Finally, we detect outliers according to the reconstruction probability. The main contributions of this paper are as follows: 1) we propose to detect outliers in data streams with an RNN; 2) we further combine the RNN with the variational auto-encoder framework to characterize the randomness of time series.

This paper is organized as follows. Section 2 formalizes the outlier detection problem, and Section 3 presents our algorithm. Section 4 evaluates the algorithm, and Section 5 concludes the paper and outlines future work.

2. Problem Formulation

This paper focuses on outlier detection for time series. Following [6], we refer to outliers as unexpected behaviors that do not conform to previously observed patterns. Consider a time series $X = \{x_1, x_2, \ldots, x_T\}$, where $t$ denotes the index of data points and $x_t \in \mathbb{R}^N$ denotes a data point with $N$ dimensions. We assume that $x_t$ can be approximately reconstructed from the previous data $x_1, \ldots, x_{t-1}$, as in Eq. (1):

$\hat{x}_t = f(x_1, x_2, \ldots, x_{t-1})$    (1)

where the function $f$ can be any linear or more complex function. This assumption is reasonable and has been widely used in machine learning. An outlier score can then be obtained with Eq. (2):

$e_t = E(x_t, \hat{x}_t)$    (2)

If $e_t > \delta$, $x_t$ is regarded as an outlier. Here $\delta$ is a threshold given manually, and $E$ is the reconstruction error function, which can be the L2 norm.
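As an illustration, the following minimal Python sketch applies the thresholding rule of Eq. (2) with the L2 norm as the error function; the variable names and the toy data are ours, not the paper's.

```python
import numpy as np

def detect_outliers(x, x_hat, delta):
    """Flag points whose reconstruction error exceeds a manual threshold.

    x, x_hat: arrays of shape (T, N) -- observed and reconstructed series.
    delta:    the threshold of Eq. (2), a manually chosen hyperparameter.
    """
    errors = np.linalg.norm(x - x_hat, axis=1)  # per-timestep L2 error, Eq. (2)
    return errors > delta                       # True marks an outlier

# Toy usage: a sine wave with one injected spike.
T, N = 100, 1
x = np.sin(np.linspace(0, 10, T)).reshape(T, N)
x_hat = x + 0.01 * np.random.randn(T, N)  # stand-in for a model's reconstruction
x[50] += 5.0                              # inject an outlier
print(np.where(detect_outliers(x, x_hat, delta=1.0))[0])  # -> [50]
```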

3. Algorithm Design

This paper combines an RNN and an auto-encoder to model time series, and identifies outliers based on the reconstruction error. In the following, we briefly review the RNN and the auto-encoder technique, respectively, and then introduce our algorithm.

A recurrent neural network (RNN) extends the feed-forward neural network with a recurrent hidden state, converting an input sequence into outputs through a learned sequence of non-linear transformations; it can therefore perform sequence recognition and reproduction. In particular, the Multi-Layer Perceptron (MLP) is an often used module in an RNN. Parameterized by a set of weights and biases, the recurrence and the output can be represented as Eq. (3) and Eq. (4):

$h_t = \phi(W_h h_{t-1} + W_x x_t + b_h)$    (3)

$y_t = \mathrm{softmax}(W_y h_t + b_y)$    (4)

where $W_h, W_x, b_h$ and $W_y, b_y$ are the parameters of Eq. (3) and Eq. (4), respectively, and $\phi$ is a non-linearity such as tanh. The softmax function is used at the final layer of the RNN.
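For concreteness, one recurrent step under this reconstruction of Eq. (3) and Eq. (4) can be sketched in plain numpy as follows; the tanh non-linearity and all dimensions are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, bh, Wy, by):
    """One recurrent step: tanh hidden update (Eq. 3), softmax output (Eq. 4)."""
    h_t = np.tanh(Wx @ x_t + Wh @ h_prev + bh)   # Eq. (3)
    logits = Wy @ h_t + by
    y_t = np.exp(logits - logits.max())          # numerically stable softmax
    y_t /= y_t.sum()                             # Eq. (4)
    return h_t, y_t

# Illustrative dimensions: 4-dim input, 8-dim state, 3-dim output.
rng = np.random.default_rng(0)
Wx, Wh, Wy = rng.normal(size=(8, 4)), rng.normal(size=(8, 8)), rng.normal(size=(3, 8))
bh, by = np.zeros(8), np.zeros(3)
h = np.zeros(8)
for x_t in rng.normal(size=(5, 4)):  # unroll over a length-5 toy sequence
    h, y = rnn_step(x_t, h, Wx, Wh, bh, Wy, by)
```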

Recently, the Variational Auto-Encoder (VAE) [7, 8] has proven to be an effective deep generative model for recovering complex multimodal distributions over a data space. It introduces a set of latent random variables z designed to capture the variations in the observed variables x. A VAE consists of two modules, an encoder and a decoder: the encoder models the latent space from the input data, and the decoder approximates the input from the latent space. The conditional p(x|z), also known as the generative model, is an arbitrary observation model whose parameters are obtained by a parametric function of z; a VAE typically parameterizes p(x|z) with a neural network. However, the exact posterior p(z|x) is intractable. The VAE therefore approximates the posterior with a variational distribution q(z|x), which enables the use of the lower bound in Eq. (5):

$\log p(x) \geq -\mathrm{KL}(q(z|x) \,\|\, p(z)) + \mathbb{E}_{q(z|x)}[\log p(x|z)]$    (5)

where KL(Q||P) is the Kullback-Leibler divergence between two distributions Q and P. The approximate posterior q(z|x) is a Gaussian N(µ, diag(σ²)) whose mean µ and variance σ² are the output of a highly non-linear function of x. The generative model p(x|z) and the inference model q(z|x) are then trained jointly by maximizing the variational lower bound with respect to their parameters, with the expectation over q(z|x) approximated statistically. The VAE also uses the reparameterization trick of Eq. (6) to reduce the variance of the estimated gradient:

$z = \mu + \sigma \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)$    (6)

The inference model can then be trained through the standard backpropagation technique for stochastic gradient descent.
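The following numpy sketch shows the loss implied by Eq. (5) and Eq. (6); `encoder` and `decoder` are assumed callables standing in for the two neural networks, and in practice the gradients would come from an automatic differentiation framework rather than this forward-only code.

```python
import numpy as np

def vae_loss(x, encoder, decoder, rng=np.random.default_rng(0)):
    """One-sample negative ELBO for Gaussian q(z|x) and Gaussian p(x|z)."""
    mu, log_var = encoder(x)              # q(z|x) = N(mu, diag(exp(log_var)))
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps  # reparameterization trick, Eq. (6)
    # KL(q(z|x) || N(0, I)) in closed form for diagonal Gaussians
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    # -log p(x|z) up to a constant, assuming a unit-variance Gaussian decoder
    recon = 0.5 * np.sum((x - decoder(z))**2)
    return recon + kl  # minimizing this maximizes the bound of Eq. (5)

# Toy usage with identity stand-ins on a 3-dimensional input:
loss = vae_loss(np.ones(3), lambda x: (x, np.zeros(3)), lambda z: z)
```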

Next, we demonstrate how to incorporate the VAE into an RNN in order to accurately detect outliers in time series. Our architecture is shown in Fig. 1, in which a VAE is incorporated into each timestep. Since the latent variable $z_t$ is conditioned on the state variable $h_{t-1}$ of the RNN, it helps the VAE take the temporal structure of the time series into account. Compared with the standard VAE, the prior on the latent random variable is $z_t \sim \mathcal{N}(\mu_{0,t}, \mathrm{diag}(\sigma_{0,t}^2))$, in which $[\mu_{0,t}, \sigma_{0,t}] = \varphi^{\mathrm{prior}}(h_{t-1})$, so that the state variable $h_{t-1}$ is taken into consideration. The generative model and the inference model are then described by Eq. (7) and Eq. (8), respectively:

$x_t \mid z_t \sim \mathcal{N}(\mu_{x,t}, \mathrm{diag}(\sigma_{x,t}^2))$, where $[\mu_{x,t}, \sigma_{x,t}] = \varphi^{\mathrm{dec}}(z_t, h_{t-1})$    (7)

$z_t \mid x_t \sim \mathcal{N}(\mu_{z,t}, \mathrm{diag}(\sigma_{z,t}^2))$, where $[\mu_{z,t}, \sigma_{z,t}] = \varphi^{\mathrm{enc}}(x_t, h_{t-1})$    (8)

where $\varphi^{\mathrm{dec}}$ and $\varphi^{\mathrm{enc}}$ denote the decoding and encoding neural networks, respectively. The state variable is obtained with Eq. (9):

$h_t = f(x_t, z_t, h_{t-1})$    (9)

We revise the objective as Eq. (10):

$\mathbb{E}_{q(z_{\leq T} \mid x_{\leq T})}\left[\sum_{t=1}^{T}\left(-\mathrm{KL}\big(q(z_t \mid x_{\leq t}, z_{<t}) \,\|\, p(z_t \mid x_{<t}, z_{<t})\big) + \log p(x_t \mid z_{\leq t}, x_{<t})\right)\right]$    (10)

Figure 1 The architecture of our algorithm.
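A structural sketch of one timestep of the architecture in Fig. 1 is given below, with small random one-layer networks standing in for $\varphi^{\mathrm{prior}}$, $\varphi^{\mathrm{enc}}$, $\varphi^{\mathrm{dec}}$, and the state update $f$; all dimensions and the initialization are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
D_x, D_z, D_h = 4, 2, 8

def net(d_in, d_out):  # stand-in for one of the paper's neural networks
    W, b = rng.normal(scale=0.1, size=(d_out, d_in)), np.zeros(d_out)
    return lambda v: np.tanh(W @ v + b)

prior_net = net(D_h, 2 * D_z)          # phi_prior(h_{t-1})
enc_net = net(D_x + D_h, 2 * D_z)      # phi_enc(x_t, h_{t-1}), Eq. (8)
dec_net = net(D_z + D_h, 2 * D_x)      # phi_dec(z_t, h_{t-1}), Eq. (7)
rec_net = net(D_x + D_z + D_h, D_h)    # state update f, Eq. (9)

def vrnn_step(x_t, h_prev):
    mu_0, log_var_0 = np.split(prior_net(h_prev), 2)  # prior, used in Eq. (10)'s KL
    mu_z, log_var_z = np.split(enc_net(np.concatenate([x_t, h_prev])), 2)
    z_t = mu_z + np.exp(0.5 * log_var_z) * rng.normal(size=D_z)  # sample q(z_t|x_t)
    mu_x, log_var_x = np.split(dec_net(np.concatenate([z_t, h_prev])), 2)
    h_t = rec_net(np.concatenate([x_t, z_t, h_prev]))            # Eq. (9)
    return h_t, (mu_x, log_var_x), (mu_z, log_var_z), (mu_0, log_var_0)

h = np.zeros(D_h)
for x_t in rng.normal(size=(5, D_x)):  # unroll over a length-5 toy sequence
    h, p_x, q_z, prior_z = vrnn_step(x_t, h)
```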


Last, we identify outliers based on the likelihood of $x_t$. Since this likelihood is calculated from stochastic latent variables whose parameters derive from the original input, it represents the probability of the data being generated from a given latent variable. Because a number of samples are drawn from the latent variable distribution, the reconstruction probability can take into account the variability of the latent variable space, which is one of the main distinctions between the proposed method and autoencoder based anomaly detection. Other distributions of the input variable space that fit the data can also be used: for continuous data, the normal distribution; for binary data, the Bernoulli distribution. For the distribution of the latent variable space, a simple continuous distribution such as an isotropic normal distribution is preferred. This is justified by the assumption of spectral anomaly detection that the latent variable space is much simpler than the input variable space.
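The scoring step can be sketched as the following Monte Carlo estimate, assuming the diagonal-Gaussian p(x|z) described above; `decoder` is a stand-in returning the Gaussian parameters of Eq. (7), and the sample count L and the decision threshold are hyperparameters.

```python
import numpy as np

def reconstruction_probability(x_t, mu_z, log_var_z, decoder, L=16,
                               rng=np.random.default_rng(0)):
    """Average log p(x_t | z) over L samples z ~ q(z_t | x_t); low scores
    indicate outliers."""
    log_probs = []
    for _ in range(L):
        z = mu_z + np.exp(0.5 * log_var_z) * rng.normal(size=mu_z.shape)
        mu_x, log_var_x = decoder(z)  # parameters of p(x_t | z), Eq. (7)
        # log-density of a diagonal Gaussian evaluated at x_t
        log_p = -0.5 * np.sum(log_var_x + (x_t - mu_x)**2 / np.exp(log_var_x)
                              + np.log(2 * np.pi))
        log_probs.append(log_p)
    return np.mean(log_probs)

# x_t is flagged as an outlier when its score falls below a chosen threshold:
#   is_outlier = reconstruction_probability(...) < threshold
```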

4. Evaluation

This paper evaluates the proposed method on several real datasets in terms of precision, recall, and F1 score. The study employs three benchmark datasets, KDDCUP, Thyroid, and Arrhythmia, summarized in Tab. 1. The KDDCUP dataset is from the UCI repository. Since it contains some categorical features, we use a one-hot representation to encode them, yielding 120 dimensions in total. KDDCUP includes two classes, in which the data labeled 'normal' account for 20%; we regard the 'normal' class as the outliers. The Thyroid dataset includes 3 classes in the original dataset, among which the hyperfunction class is the minority; here, the hyperfunction class is deemed the outlier.
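As an aside, the one-hot expansion of categorical features can be done as in the sketch below; the column names are hypothetical stand-ins, and the exact preprocessing used here is not specified beyond the resulting 120 dimensions.

```python
import pandas as pd

# Hypothetical miniature of KDDCUP-style data: one categorical column,
# one numeric column.
df = pd.DataFrame({"protocol_type": ["tcp", "udp", "icmp"],
                   "duration": [0, 12, 3]})
encoded = pd.get_dummies(df, columns=["protocol_type"])  # one-hot expansion
print(encoded.columns.tolist())
# ['duration', 'protocol_type_icmp', 'protocol_type_tcp', 'protocol_type_udp']
```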

Table 1 Description of Dataset.

Dataset      #Dimension   #Instance
KDDCUP       120          494,021
Thyroid      6            3,772
Arrhythmia   274          452

The experiments are performed on an Ubuntu platform with MXNet. The results are shown in Tab. 2. As demonstrated, our method achieves better results on KDDCUP than on Thyroid and Arrhythmia.

Table 2 Evaluation Results.

Dataset      Precision   Recall   F1
KDDCUP       0.69        0.72     0.70
Thyroid      0.13        0.24     0.17
Arrhythmia   0.36        0.33     0.34

5. Conclusion and Future Work

This paper designs an outlier detection method for time series based on a variational auto-encoder, incorporating a VAE into each timestep of an RNN. In the future, we plan to study the use of Gaussian Processes to further improve the performance.

Acknowledgements

This work is partially supported by the National Key Research and Development Program under grant 2016YFB0800600, and by Sichuan Provincial Science & Technology grant 2018GZ0247.


References

[1] Hodge, V.J. and Austin, J. (2004) A Survey of Outlier Detection Methodologies. Artificial Intelligence Review, 22, 85-126.
[2] Wang, J., Fang, H., et al. (2017) A New Mining Method to Detect Real Time Substance Use Events from Wearable Biosensor Data Stream. ICNC'17.
[3] Pawar, A.D., Kalavadekar, P.N., et al. A Survey on Outlier Detection Techniques for Credit Card Fraud Detection.
[4] Sun, J.Y., Shao, J., et al. (2017) Abnormal Event Detection for Video Surveillance Using Deep One-Class Learning. Multimedia Tools and Applications, 1-15.
[5] Gupta, M., Gao, J., Aggarwal, C.C., et al. (2014) Outlier Detection for Temporal Data: A Survey. IEEE Transactions on Knowledge and Data Engineering, 26, 2250-2267.
[6] Chandola, V., Banerjee, A. and Kumar, V. (2009) Anomaly Detection: A Survey. ACM Computing Surveys (CSUR), 41(3), 15.
[7] Kingma, D.P. and Welling, M. (2014) Auto-Encoding Variational Bayes. Proceedings of the International Conference on Learning Representations (ICLR).
[8] Rezende, D.J., Mohamed, S. and Wierstra, D. (2014) Stochastic Backpropagation and Approximate Inference in Deep Generative Models. Proceedings of the 31st International Conference on Machine Learning (ICML), 1278-1286.
