LSTM is one of the most powerful **algorithms** out there for **time series forecasting**. Code that may span several lines with models such as **ARIMA** can often be completed within a couple of lines using LSTM. If you want to demystify the mystery behind LSTM, I would suggest you take a look at my previous article, where the concept underlying the working of LSTM is explained. However, you can only give life to a product if you are able to code it out.

This article explains the implementation of unidirectional and bidirectional LSTM networks using **Keras**. The first step when dealing with any algorithm is data pre-processing, and the same principle applies to LSTM as well: we have to feed the input in a form the LSTM can understand. So let's dive into the code.

The first step is to import the necessary libraries. Libraries make our lives easier, as we don't have to implement everything from scratch.

```python
import pandas as pd
from pandas import DataFrame
import numpy as np
from numpy import hstack
import tensorflow as tf
from tensorflow import keras
from sklearn import preprocessing
```

If you get an error while importing the libraries, you have to install them first. Once the libraries are imported, we can load our dataset.

```python
df = pd.read_excel('.......xlsx')
df.head()
```

I am loading my dataset and printing its first 5 rows using **df.head()**; if you want to print the last 5 rows of a dataset, you can use **df.tail()**.

```python
df.columns = ['Date', 'Rainfall']  # naming the columns
train = df[0:12000]
test = df[12000:15704]
X_train = train['Rainfall']
X_test = test['Rainfall']
```

The dataset consists of two columns, date and rainfall. This is a case of univariate time series forecasting, as there is only one feature: *rainfall*. We usually divide the dataset into train and test sets, and I am doing the same here.

```python
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(X_train.values.reshape(-1, 1))
X_train = pd.DataFrame(x_scaled)
# Use transform (not fit_transform) on the test set, so it is scaled
# with the min and max learned from the training set
x_scaled1 = min_max_scaler.transform(X_test.values.reshape(-1, 1))
X_test = pd.DataFrame(x_scaled1)
```

I am doing some pre-processing using the min-max scaler, which transforms the values to lie between 0 and 1.
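Under the hood, min-max scaling maps each value x to (x − min) / (max − min). A minimal sketch with plain NumPy (the rainfall values here are made up for illustration):

```python
import numpy as np

# Hypothetical rainfall values (mm), for illustration only
rain = np.array([0.0, 5.0, 20.0, 10.0])

# Min-max scaling: (x - min) / (max - min) maps every value into [0, 1]
scaled = (rain - rain.min()) / (rain.max() - rain.min())
print(scaled)  # → [0.   0.25 1.   0.5 ]
```

This is exactly what `MinMaxScaler` computes, with the convenience that it remembers the training min and max so the same mapping can be reused on the test set.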

```python
in_seq = np.array([X_train[0]])
in_seqtest1 = np.array([X_test[0]])
in_seq = in_seq.reshape((12000, 1))
in_seqtest1 = in_seqtest1.reshape((3704, 1))
```

The above block of code converts the data into NumPy arrays and reshapes them.
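The reshape to `(n, 1)` turns a flat sequence into a column of samples, one value per row, which is the layout the later steps expect. A tiny sketch with placeholder values:

```python
import numpy as np

# A flat sequence of 4 values (placeholder data)
seq = np.array([1.0, 2.0, 3.0, 4.0])

# reshape((4, 1)) produces a column: 4 samples, 1 value each
col = seq.reshape((4, 1))
print(col.shape)  # → (4, 1)
```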

```python
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this window
        end_ix = i + n_steps
        # stop once the window runs past the end of the sequence
        if end_ix > len(sequence) - 1:
            break
        # the window is the input, the value just after it is the target
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
```

The above **split_sequence** function is the driving block, as it is responsible for splitting the input into features and target values based on the **n_steps** we provide.
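To see what the function produces, here it is applied to a toy sequence with `n_steps=3` (the function is repeated so the snippet runs standalone; the values are made up):

```python
import numpy as np

def split_sequence(sequence, n_steps):
    # Slide a window of n_steps values; the value after each window is the target
    X, y = list(), list()
    for i in range(len(sequence)):
        end_ix = i + n_steps
        if end_ix > len(sequence) - 1:
            break
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

X, y = split_sequence([10, 20, 30, 40, 50], n_steps=3)
print(X)  # → [[10 20 30]
          #    [20 30 40]]
print(y)  # → [40 50]
```

Each row of `X` is one window of three past values, and the matching entry of `y` is the value that immediately follows it.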

```python
df_test = hstack((in_seqtest1))
df_train = hstack((in_seq))
```

The above piece of code stacks the arrays in sequence.

```python
n_steps = 3
# convert into input/output
X, y = split_sequence(df_train, n_steps)
X_test, y_test = split_sequence(df_test, n_steps)
```

I am converting the training and testing sets into input/output sequences using the **split_sequence** function declared earlier. In this case I am using a time step of 3, so there will be **3 features** for each target value. That is, the value at step n+1 depends on the values at steps n, n−1 and n−2: the n+1 value is the target, and the other three act as features.

```python
model = keras.Sequential()
model.add(keras.layers.LSTM(10, activation='relu', return_sequences=True, input_shape=(n_steps, 1)))
model.add(keras.layers.LSTM(10, activation='relu'))
model.add(keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse')
```

The above LSTM network has two **hidden layers** with 10 neurons per layer and uses the **relu** activation function. It is compiled with the **adam** optimizer and **mse** as the loss function.


We are almost done; now we can feed our data to the LSTM network after reshaping it.

```python
X = X.reshape((X.shape[0], X.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
model.fit(X, y, validation_data=(X_test, y_test), epochs=50)
```

We have successfully trained our model. Now it's time to check how well it has performed.

```python
# correlation on the training set
ypred = model.predict(X)
s = pd.DataFrame(ypred)
s1 = pd.DataFrame(y)
s1.corrwith(s, axis=0)

# correlation on the test set
ypred1 = model.predict(X_test)
s = pd.DataFrame(ypred1)
s1 = pd.DataFrame(y_test)
s1.corrwith(s, axis=0)
```

The above block of code calculates the correlation between the predicted and actual values, for both the training and test sets.
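**corrwith** computes the column-wise Pearson correlation, so a value close to 1 means the predictions track the actual series well. The same statistic can be cross-checked with plain NumPy (toy values here, not model output):

```python
import numpy as np

# Toy actual vs. predicted values, for illustration only
actual = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.1, 1.9, 3.2, 3.9])

# Pearson correlation coefficient, the same statistic corrwith reports
corr = np.corrcoef(actual, pred)[0, 1]
print(corr)  # close to 1 when predictions track the actual values
```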

## Bidirectional LSTM :

For a Bidirectional LSTM, where the network learns from the sequence in both the forward and backward directions, we just have to change one block of code in the unidirectional LSTM.

```python
model = keras.Sequential()
model.add(keras.layers.Bidirectional(keras.layers.LSTM(10, activation='relu', return_sequences=True), input_shape=(n_steps, 1)))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(10, activation='relu')))
model.add(keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse')
```

You can alter the hyper-parameters based on your dataset. This is the power of LSTM compared to other machine learning algorithms: the LSTM does most of your job. Try implementing LSTM on the time series forecasting problems you are working on, and the results may surprise you!

The entire source code is available on my **GitHub**.