LSTM is one of the most powerful algorithms out there when dealing with time series forecasting. Code that may span several lines with models such as ARIMA can be completed within a couple of lines using LSTM. If you want to demystify the mystery behind LSTM, I would suggest you take a look at my previous article, where the concept underlying the working of LSTM is explained. However, you can only give life to a product if you are able to code it out.
This article explains the implementation of unidirectional and bidirectional LSTM networks using Keras. The first step when dealing with any algorithm is data preprocessing, and the same principle applies to LSTM as well. We have to feed the input in a form the LSTM can understand. So let's dive into the code.
The first step is to import the necessary libraries. Libraries make our lives easier as we don't have to implement everything from scratch.
import pandas as pd
from pandas import DataFrame
import numpy as np
from numpy import hstack
import tensorflow as tf
from tensorflow import keras
from sklearn import preprocessing
If you get an error while importing a library, you have to install it first. Once the libraries are imported, we can load our data-set.
I am loading my data-set and printing its first 5 rows using df.head(); if you want to print the last 5 rows of a data-set you can use df.tail().
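The loading step itself depends on where your file lives; a minimal sketch (the file name rainfall.csv is an assumption, adjust it to your own data-set) looks like this:
df = pd.read_csv('rainfall.csv')  # file name is an assumption
df.head()   # first 5 rows
df.tail()   # last 5 rows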
df.columns = ['Date', 'Rainfall'] #naming of columns
The data-set consists of two columns, date and rainfall. This is a case of univariate time series forecasting, since there is only one feature: rainfall. We usually divide the data-set into train and test sets, and I am applying the same here.
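The split itself is not shown above; a minimal sketch consistent with the shapes used later (12000 training values and 3704 test values, so adjust the cut-off for your own data) would be:
X_train = df['Rainfall'][:12000]   # first 12000 values for training
X_test = df['Rainfall'][12000:]    # remaining 3704 values for testing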
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(X_train.values.reshape(-1,1))
X_train = pd.DataFrame(x_scaled)
x_scaled1 = min_max_scaler.transform(X_test.values.reshape(-1,1))  # reuse the scaler fitted on the training set
X_test = pd.DataFrame(x_scaled1)
I am doing some preprocessing with MinMaxScaler to scale the values between 0 and 1.
in_seq = np.array([X_train])
in_seqtest1 = np.array([X_test])
in_seq = in_seq.reshape((12000, 1))
in_seqtest1 = in_seqtest1.reshape((3704, 1))
The above block of code converts the data into NumPy arrays and reshapes them (12000 training values and 3704 test values, each as a single column).
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # stop once we run past the end of the sequence
        if end_ix > len(sequence)-1:
            break
        # gather the input (features) and output (target) parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
The above split_sequence function is the driving block, as it is responsible for splitting the input into features and target values based on the n_steps we provide.
df_test = hstack((in_seqtest1))
df_train = hstack((in_seq))
The above piece of code stacks the arrays in sequence horizontally.
n_steps = 3
# convert into input/output
X, y = split_sequence(df_train, n_steps)
X_test,y_test = split_sequence(df_test, n_steps)
I am converting the training and testing sets to input/output sequences by using the split_sequence function I declared earlier. In this case I am using a time step of 3, so there will be 3 features for each target value. That is, the (n+1)-th value depends on the n-th, (n-1)-th and (n-2)-th values: the (n+1)-th value is the target and the preceding three act as features.
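To see what split_sequence does, consider a toy sequence with n_steps = 3:
split_sequence(np.array([10, 20, 30, 40, 50]), 3)
# X = [[10, 20, 30],     y = [40,
#      [20, 30, 40]]          50]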
model = keras.Sequential()
model.add(keras.layers.LSTM(10, activation='relu', return_sequences=True, input_shape=(n_steps, 1)))
model.add(keras.layers.LSTM(10, activation='relu'))
The above LSTM network has two hidden layers with 10 neurons per layer and uses the relu activation function. The network is compiled using the Adam optimizer with mse as the loss.
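The compilation step (and an output layer producing the single forecast value) is not shown above; a minimal sketch matching that description, where the Dense(1) output layer is an assumption, would be:
model.add(keras.layers.Dense(1))              # single output neuron for the forecast value (assumed)
model.compile(optimizer='adam', loss='mse')   # Adam optimizer, mean squared error loss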
We are almost done. Now we can feed our data to the LSTM network after reshaping it.
X = X.reshape((X.shape[0], X.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
model.fit(X, y, validation_data=(X_test,y_test),epochs=50)
We have successfully trained our model using LSTM. Now it's time to check how well our model has performed.
ypred = model.predict(X)
s, s1 = pd.DataFrame(y), pd.DataFrame(ypred)   # actual values and predicted values
s1.corrwith(s, axis=0)
The above block of code calculates the correlation between the predicted and the actual values.
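If you prefer a single error number on unseen data, Keras can also report the test loss directly; a quick sketch, reusing the mse loss the model was compiled with:
test_mse = model.evaluate(X_test, y_test)   # mean squared error on the test set
print(test_mse)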
Bidirectional LSTM:
For a Bidirectional LSTM, where our network learns from the sequence in both the forward and backward directions, we only have to change one block of code in the unidirectional LSTM.
model = keras.Sequential()
model.add(keras.layers.Bidirectional(keras.layers.LSTM(10, activation='relu', return_sequences=True, input_shape=(n_steps, 1))))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(10, activation='relu')))
You can alter the hyper-parameters based on your data-set. This is the power of LSTM compared to other machine learning algorithms: the LSTM does most of your job. Try implementing LSTM on the time series forecasting problems you are working on, and the results may surprise you!
The entire source code is available on my GitHub.