2020-02-03 17:02:30

 

Source: https://towardsdatascience.com/deep-learning-for-time-series-classification-inceptiontime-245703f422db

Index

  1. Motivation
  2. Machine Learning for Time Series Classification
  3. Best Deep Learning practices for Time Series Classification: InceptionTime
  4. Understanding InceptionTime
  5. Conclusion

1. Motivation

Time series data have always been of major interest to financial services, and now, with the rise of real-time applications, other areas such as retail and programmatic advertising are turning their attention to time-series-driven applications. In the last couple of years, several key players in data streaming and processing, such as Apache Kafka and Apache Spark, have released new tools for handling time series data. It is therefore of great interest to understand the role and potential of Machine Learning (ML) in this rising field.

In this article, I will review the state of the art in deep learning (DL) for time series classification (TSC) [2].

2. Machine Learning for Time Series Classification

Defining the problem:

TSC is the area of ML interested in learning how to assign labels to time series. More concretely, we are interested in training an ML model which, when fed a series of data points indexed in time order (e.g. the historical data of a financial asset), outputs a label (e.g. the industry sector of the asset).

In general, these labels take values from a predefined set of classes, {y¹, … , yᵏ}.

Do we really need DL?

It is always important to remind ourselves that DL is nothing but a set of tools for solving problems, and although it can be very powerful, that doesn’t mean we should blindly apply DL techniques to every single problem out there. After all, training and tuning a neural network can be very time-consuming, so it is always good practice to test the performance of other ML models first and only then address any potential shortcomings.

Oftentimes the nature of a problem is determined by the data itself; in our case, the way one chooses to process and classify a time series depends highly on the length and statistics of the data. That being said, let us run a quick dimensional analysis to estimate the complexity of our problem.

In particular, the length of the time series can really hurt computational speed. However, for certain types of data, this problem can be alleviated without resorting to sophisticated machine learning models such as deep neural networks. Consider, for instance, a periodic signal such as a square wave: by Fourier analysis, it can be decomposed into a superposition of sine waves whose frequencies are integer multiples of a fundamental frequency ω.

 
Fig. 2: The Fourier series expansion of a square wave (red line). Here I present only the first three modes (blue dashed lines) and their addition (green line). Hopefully, it is not hard to see that by adding the next modes the series quickly converges to a square wave.

By taking the linear sum of these signals, we can reconstruct our original signal:

Square Wave(t) = W¹ ⋅ sin(f₁ t) + W² ⋅ sin(f₂ t) + W³ ⋅ sin(f₃ t)+ …

Here the coefficients W¹, W², W³, … denote the weight that each mode contributes to the square wave.

More generally, any time series of T time-ordered data points can also be represented by a weight vector in the space spanned by the elementary frequency modes:

(W¹, W², W³, …) .

Truncating this expansion to the few dominant modes takes us from the original T-dimensional representation of our time series data to a number of dimensions (in Fourier space) that makes our classification problem computationally tractable. Overall, we can apply a Fourier transform during the data pre-processing phase to convert the input time series into weight vectors, and thereafter proceed by building our classification model (e.g. a 1-nearest-neighbor classifier). Working with such “well-behaved” time series, we can achieve high performance without the use of DL.
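As a minimal sketch of this preprocessing idea (my own illustration, not the article's code), we can use NumPy's real FFT to map each series of length T to a handful of Fourier-coefficient magnitudes, and then classify in that low-dimensional space with a 1-nearest-neighbor rule:

```python
import numpy as np

def fourier_features(X, n_modes=12):
    """Map each time series (row of X) to the magnitudes of its
    first n_modes Fourier coefficients (excluding the DC term)."""
    coeffs = np.fft.rfft(X, axis=1)          # shape (n_series, T//2 + 1)
    return np.abs(coeffs[:, 1:n_modes + 1])  # keep only a few dominant modes

def knn_predict(train_feats, train_labels, test_feats):
    """1-nearest-neighbor classification in Fourier space."""
    # Euclidean distance between every test and every train feature vector.
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    return train_labels[np.argmin(d, axis=1)]

# Toy example: distinguish slow sine waves from fast ones, T = 128.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 128, endpoint=False)
slow = np.sin(2 * np.pi * 2 * t) + 0.1 * rng.standard_normal((20, 128))
fast = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal((20, 128))
X = np.vstack([slow, fast])
y = np.array([0] * 20 + [1] * 20)

feats = fourier_features(X)
preds = knn_predict(feats[:30], y[:30], feats[30:])  # last 10 are "fast"
```

Even with the noise, the two classes separate cleanly in Fourier space, so the 1-NN rule labels all held-out fast waves correctly.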

For unstructured, noisy data, however, letting the model learn how to process the time series on its own is a more promising solution.

3. Best DL practices for TSC: InceptionTime

InceptionTime’s high accuracy, together with its scalability, renders it the perfect candidate for product development!

To this end, let us present the most important components of InceptionTime and how these are implemented in Keras.

3.1 The Input Layer

In general, a time series of length T is an ordered sequence X = (X¹, … , Xᵀ), where each observation Xʲ ∈ ℝᵐ is an m-dimensional vector. For a univariate time series m = 1, while for a multivariate one m > 1. Accordingly, the input layer of the network expects tensors of shape input_shape = (T, m).
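For instance (a minimal sketch with variable names of my own), a batch of univariate series of length T = 128 would be fed to a Keras model like this:

```python
import numpy as np
from tensorflow.keras import layers, models

T, m = 128, 1                        # series length, dimensions per time step
inputs = layers.Input(shape=(T, m))  # the batch size is left unspecified
x = layers.Conv1D(32, kernel_size=10, padding="same")(inputs)
model = models.Model(inputs, x)

batch = np.zeros((8, T, m), dtype="float32")  # 8 univariate series
out = model(batch)                            # shape (8, 128, 32)
```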

3.2 The Inception Module

The major building block of InceptionTime is the inception module, shown in the figure below:

 
Fig. 3: The inception module of InceptionTime. The first number in the boxes indicates the kernel size while the second indicates the size of the stride. “(S)” specifies the type of padding, i.e. “same”.

This consists of the following layers:

  • a bottleneck layer to reduce the dimensionality (i.e. the depth) of the inputs. This cuts the computational cost and the number of parameters, speeding up training and improving generalization.
  • three one-dimensional convolutional layers of kernel sizes 10, 20 and 40, applied in parallel to the output of the bottleneck layer.
  • a MaxPooling layer applied in parallel to the module’s input, followed by a second bottleneck layer.
  • a depth concatenation layer, where the outputs of the four parallel branches above are concatenated along the depth dimension.

By default, each convolutional layer in the module uses 32 filters.

Keras Implementation

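A minimal Keras sketch of the module described above (function and argument names are my own; the 32-filter default follows the description, and stride 1 with “same” padding follows Fig. 3):

```python
import tensorflow as tf
from tensorflow.keras import layers

def inception_module(inputs, n_filters=32, bottleneck_size=32,
                     kernel_sizes=(10, 20, 40)):
    # 1. Bottleneck: a 1x1 convolution that reduces the depth of the input.
    bottleneck = layers.Conv1D(bottleneck_size, kernel_size=1,
                               padding="same", use_bias=False)(inputs)

    # 2. Parallel convolutions of kernel sizes 10, 20 and 40 applied to
    #    the bottleneck output.
    branches = [layers.Conv1D(n_filters, kernel_size=k, padding="same",
                              use_bias=False)(bottleneck)
                for k in kernel_sizes]

    # 3. Max-pooling branch on the raw input, followed by a second bottleneck.
    pool = layers.MaxPooling1D(pool_size=3, strides=1, padding="same")(inputs)
    branches.append(layers.Conv1D(n_filters, kernel_size=1, padding="same",
                                  use_bias=False)(pool))

    # 4. Concatenate the four branches along the depth (channel) dimension.
    x = layers.Concatenate(axis=-1)(branches)
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)
```

With the defaults, a (T, m) input comes out as a (T, 4 × 32) = (T, 128) tensor, since all four branches preserve the time dimension.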

3.3 The Inception Network

The network architecture of InceptionTime closely resembles that of GoogleNet [7]. In particular, the network consists of a series of Inception modules followed by a Global Average Pooling layer and a Dense layer with a softmax activation function.

 
Fig. 4: The Inception network for time series classification.

Moreover, InceptionTime introduces residual connections at every third inception module, which help the gradients flow through the deeper layers.

 
Fig. 5: Residual Connections in the Inception network.

Keras Implementation

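A sketch of the full network in Keras (again, all names are mine; the module is a compact version of the one in Section 3.2, and the depth of six modules is an assumption for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def inception_module(x, n_filters=32):
    # Bottleneck, three parallel convolutions, and a max-pooling branch.
    bottleneck = layers.Conv1D(n_filters, 1, padding="same", use_bias=False)(x)
    branches = [layers.Conv1D(n_filters, k, padding="same", use_bias=False)(bottleneck)
                for k in (10, 20, 40)]
    pool = layers.MaxPooling1D(3, strides=1, padding="same")(x)
    branches.append(layers.Conv1D(n_filters, 1, padding="same", use_bias=False)(pool))
    x = layers.BatchNormalization()(layers.Concatenate()(branches))
    return layers.Activation("relu")(x)

def shortcut_layer(inputs, z):
    # Residual connection: project the shortcut to the same depth as z,
    # add the two tensors, and apply a ReLU.
    shortcut = layers.Conv1D(int(z.shape[-1]), 1, padding="same",
                             use_bias=False)(inputs)
    shortcut = layers.BatchNormalization()(shortcut)
    return layers.Activation("relu")(layers.Add()([shortcut, z]))

def build_inception_network(input_shape, n_classes, depth=6):
    inputs = layers.Input(shape=input_shape)
    x, residual_input = inputs, inputs
    for d in range(depth):
        x = inception_module(x)
        if d % 3 == 2:  # residual connection at every third module
            x = shortcut_layer(residual_input, x)
            residual_input = x
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)
```

For example, `build_inception_network((128, 1), n_classes=5)` returns a model mapping a univariate series of length 128 to five class probabilities.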

3.4 InceptionTime: a neural network ensemble for TSC

InceptionTime itself is an ensemble of five Inception networks of the kind described above, each initialized with different random weights; their softmax outputs are averaged to produce the final prediction. The full implementation is available on Github.
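The ensembling step itself is simple: the class probabilities of the individual networks are averaged and the most probable class is returned (a minimal NumPy sketch; the array shapes are my own convention):

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average the softmax outputs of several networks and pick the
    argmax class.

    member_probs: array of shape (n_members, n_samples, n_classes)."""
    mean_probs = np.mean(member_probs, axis=0)
    return np.argmax(mean_probs, axis=-1)

# Five ensemble members, two samples, three classes:
probs = np.array([
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.5, 0.4, 0.1], [0.1, 0.6, 0.3]],
    [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]],
    [[0.4, 0.5, 0.1], [0.2, 0.7, 0.1]],
    [[0.6, 0.3, 0.1], [0.1, 0.5, 0.4]],
])
ensemble_predict(probs)  # → array([0, 1])
```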

4. Understanding InceptionTime

As mentioned earlier, InceptionTime was primarily inspired by CNNs for computer-vision problems, and we therefore expect our model to learn features in a similar fashion. For example, in image classification, the neurons in the bottom layers learn to identify low-level (local) features such as lines, while the neurons in higher layers learn to detect high-level (global) features such as shapes (e.g. eyes). Similarly, we expect the bottom-layer neurons of InceptionTime to capture the local structure of a time series such as lines and curves, and the top-layer neurons to identify various shape patterns such as “valleys” and “hills”.

 
Fig. 6: The receptive field of neurons in filters.

The portion of the input that can influence a given neuron’s output is called the receptive field of that particular neuron. In object recognition, larger receptive fields are used to capture more context. It is therefore natural to pick larger receptive fields when working with very long time series data, so that InceptionTime learns to detect larger patterns.
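As a back-of-the-envelope illustration (my own sketch, not the article’s): for a stack of stride-1 convolutions, each layer of kernel size k widens the receptive field by (k − 1), so longer filters or deeper stacks see more of the series.

```python
def receptive_field(kernel_sizes):
    # For stacked stride-1 convolutions, the receptive field of a neuron
    # in the last layer is 1 + sum(k_i - 1) over the layers below it.
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# e.g. six stacked convolutions using the largest InceptionTime kernel (40):
receptive_field([40] * 6)  # → 235
```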

5. Conclusion

An important aspect of TSC is the length of the time series, as this can slow down training. Although there are various mathematical preprocessing schemes (such as a Fourier transform) that one can use to tackle this problem, InceptionTime is currently the leading algorithm for TSC, especially for long, noisy time series data.

Notably, its complexity grows only linearly with both the number of training samples n and the length of the time series T, i.e. O(n ⋅ T)! Overall, InceptionTime has put TSC problems on the same footing as image classification problems, and it is therefore exciting to explore its different applications in industry.

In an upcoming post, I will discuss how I used InceptionTime to classify and cluster financial assets based on different attributes, such as industry sector/class, location and performance.

References

[1] Learning Internal Representations by Error Propagation

[2] Deep learning for time series classification: a review

[3] The Great Time Series Classification Bake Off: An Experimental Evaluation of Recently Proposed Algorithms. Extended Version

[4] HIVE-COTE: The Hierarchical Vote Collective of Transformation-based Ensembles for Time Series Classification

[5] UCR Time Series Classification Archive

[6] InceptionTime: Finding AlexNet for Time Series Classification

[7] Going deeper with convolutions (GoogleNet)

[8] Deep Neural Network Ensembles for Time Series Classification
