Credit: BecomingHuman
Obviously, these two sentences have widely varying impacts and meanings!
This is where recurrent neural networks come into play. They attempt to retain some of the information carried by the order of sequential data.
With a recurrent neural network, your input data is passed into a cell, which outputs the result of its activation function. We then take that output and feed it back into the same cell as an additional input.
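To make that feedback loop concrete, here is a minimal NumPy sketch of a plain recurrent cell. It is an illustration under assumptions, not how Keras implements its layers: the names `rnn_step`, `W_x`, `W_h`, and the tiny sizes are made up for this example.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a plain recurrent cell: mix the new input with the previous output."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

rng = np.random.default_rng(0)
n_features, n_units, n_steps = 4, 3, 5   # illustrative sizes, not from the article
W_x = rng.normal(size=(n_features, n_units))
W_h = rng.normal(size=(n_units, n_units))
b = np.zeros(n_units)

sequence = rng.normal(size=(n_steps, n_features))

h = np.zeros(n_units)                    # initial state: nothing seen yet
for x_t in sequence:
    h = rnn_step(x_t, h, W_x, W_h, b)    # the output is fed back in on the next step

# Feeding the same steps in reverse order gives a different final state,
# which is the sense in which the cell is sensitive to order.
h_reversed = np.zeros(n_units)
for x_t in sequence[::-1]:
    h_reversed = rnn_step(x_t, h_reversed, W_x, W_h, b)

print(h)
print(h_reversed)
```

The only thing that distinguishes this from a dense layer is the `h_prev @ W_h` term, which carries the previous output forward.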
This is where the Long Short Term Memory (LSTM) Cell comes in. An LSTM cell looks like:
The idea here is that we can have some sort of functions for determining what to forget from previous cells, what to add from the new input data, what to output to new cells, and what to actually pass on to the next layer.
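As a rough sketch of those functions, the standard LSTM formulation uses a forget gate, an input gate, and an output gate, each a sigmoid over the current input and the previous output. The code below is a hand-written NumPy illustration with assumed names (`lstm_step`, `W`, `U`, `b`) and random weights, not the Keras implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, and b hold the parameters of all four parts stacked together."""
    n = h_prev.shape[0]
    z = x_t @ W + h_prev @ U + b       # shape (4 * n,)
    f = sigmoid(z[0 * n:1 * n])        # forget gate: what to drop from the old cell state
    i = sigmoid(z[1 * n:2 * n])        # input gate: how much of the new candidate to add
    g = np.tanh(z[2 * n:3 * n])        # candidate values built from the new input
    o = sigmoid(z[3 * n:4 * n])        # output gate: what to expose as this step's output
    c_t = f * c_prev + i * g           # updated cell state (the long-term memory)
    h_t = o * np.tanh(c_t)             # output passed to the next step and the next layer
    return h_t, c_t

rng = np.random.default_rng(0)
n_features, n_units = 4, 3             # illustrative sizes
W = rng.normal(scale=0.1, size=(n_features, 4 * n_units))
U = rng.normal(scale=0.1, size=(n_units, 4 * n_units))
b = np.zeros(4 * n_units)

h = np.zeros(n_units)
c = np.zeros(n_units)
for x_t in rng.normal(size=(5, n_features)):
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape, c.shape)
```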
Now let’s work on applying an RNN to something simple, then we’ll use an RNN on a more realistic use-case. I am going to have us start by using an RNN to predict MNIST, since that’s a simple dataset, already in sequences, and we can understand what the model wants from us relatively easily.
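One way to read "already in sequences": each 28x28 image can be treated as 28 time steps (the rows), each with 28 features (the pixels in that row). The sketch below uses a random stand-in array rather than a real MNIST image, so it runs without downloading anything.

```python
import numpy as np

# Stand-in for one MNIST image: 28 rows by 28 columns of pixel intensities.
# (Random values here, so the sketch runs without downloading the dataset.)
image = np.random.default_rng(0).integers(0, 256, size=(28, 28))

timesteps, features = image.shape
print(timesteps, features)

# A recurrent layer would consume the image one row at a time.
for t, row in enumerate(image[:3]):      # first three time steps only
    print(f"step {t}: {row.shape[0]} pixel values")
```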
We’ll begin our basic RNN example with the imports we need:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM
The type of RNN cell that we’re going to use is the LSTM cell. Layers will have dropout, and we’ll have a dense layer at the end, before the output layer.
mnist = tf.keras.datasets.mnist  # mnist is a dataset of 28x28 images of handwritten digits and their labels
(x_train, y_train), (x_test, y_test) = mnist.load_data()  # unpacks images to x_train/x_test and labels to y_train/y_test

x_train = x_train/255.0
x_test = x_test/255.0

print(x_train.shape)
print(x_train[0].shape)
model = Sequential()

model.add(LSTM(128, input_shape=(x_train.shape[1:]), activation='relu', return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(128, activation='relu'))
model.add(Dropout(0.1))

model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='softmax'))
This should all be straightforward: rather than Dense or Conv, we’re just using LSTM as the layer type. The only new thing is return_sequences. This flag is used when you’re continuing on to another recurrent layer. If you are, then you want to return sequences. If you’re not going on to another recurrent layer, then you don’t set this to true.
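The difference is easiest to see in the output shapes. The toy layer below is a hypothetical NumPy helper (`run_recurrent_layer` is not a Keras function) that mimics the two behaviors: returning the output at every time step, or only the last one.

```python
import numpy as np

def run_recurrent_layer(sequence, n_units, return_sequences=False, seed=0):
    """Toy recurrent layer (hypothetical helper, not the Keras implementation)."""
    rng = np.random.default_rng(seed)
    W_x = rng.normal(scale=0.1, size=(sequence.shape[1], n_units))
    W_h = rng.normal(scale=0.1, size=(n_units, n_units))
    h = np.zeros(n_units)
    outputs = []
    for x_t in sequence:
        h = np.tanh(x_t @ W_x + h @ W_h)
        outputs.append(h)
    # All time steps for a following recurrent layer, or only the last one.
    return np.stack(outputs) if return_sequences else outputs[-1]

sequence = np.ones((28, 28))   # one MNIST-sized input: 28 steps of 28 features
all_steps = run_recurrent_layer(sequence, 128, return_sequences=True)
last_only = run_recurrent_layer(sequence, 128, return_sequences=False)

print(all_steps.shape)   # (28, 128)
print(last_only.shape)   # (128,)
```

A second recurrent layer needs a sequence to iterate over, so it wants the `(28, 128)` form. A Dense layer wants a single vector per sample, so the last recurrent layer before it returns only the final step.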
opt = tf.keras.optimizers.Adam(learning_rate=0.001, decay=1e-6)  # decay is accepted by older tf.keras releases; newer ones may reject it

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy'],
)

model.fit(x_train,
          y_train,
          epochs=3,
          validation_data=(x_test, y_test))
Train on 60000 samples, validate on 10000 samples
Epoch 1/3
60000/60000 [==============================] - 189s 3ms/step - loss: 0.5922 - acc: 0.8056 - val_loss: 0.1395 - val_acc: 0.9601
Epoch 2/3
60000/60000 [==============================] - 186s 3ms/step - loss: 0.1686 - acc: 0.9538 - val_loss: 0.0797 - val_acc: 0.9751
Epoch 3/3
60000/60000 [==============================] - 187s 3ms/step - loss: 0.1137 - acc: 0.9692 - val_loss: 0.0638 - val_acc: 0.9815
Let’s stick with saying we’re trying to predict the price of Litecoin. So we need to grab the future price of Litecoin, then determine if it’s higher or lower than the current price.
We need the data. Here’s the data: Cryptocurrency pricing training dataset. Download that, then extract it in your project dir. You should have a directory called crypto_data, and inside of it should be four csv files. To read these files in and manipulate them, we’re going to use a library called pandas. Open a console/terminal and do pip install pandas.
import pandas as pd

df = pd.read_csv("crypto_data/LTC-USD.csv", names=['time', 'low', 'high', 'open', 'close', 'volume'])
print(df.head())
time low high open close volume
0 1528968660 96.580002 96.589996 96.589996 96.580002 9.647200
1 1528968720 96.449997 96.669998 96.589996 96.660004 314.387024
2 1528968780 96.470001 96.570000 96.570000 96.570000 77.129799
3 1528968840 96.449997 96.570000 96.570000 96.500000 7.216067
4 1528968900 96.279999 96.540001 96.500000 96.389999 524.539978
This is the data for LTC-USD, which is just the USD value for Litecoin. What we want to do is somehow take the close and volume from here, and combine them with those of the other 3 cryptocurrencies.
main_df = pd.DataFrame()  # begin empty

ratios = ["BTC-USD", "LTC-USD", "BCH-USD", "ETH-USD"]  # the 4 ratios we want to consider
for ratio in ratios:  # begin iteration
    print(ratio)
    dataset = f'crypto_data/{ratio}.csv'  # get the full path to the file.
    df = pd.read_csv(dataset, names=['time', 'low', 'high', 'open', 'close', 'volume'])  # read in specific file

    # rename volume and close to include the ticker so we can still tell which close/volume is which:
    df.rename(columns={"close": f"{ratio}_close", "volume": f"{ratio}_volume"}, inplace=True)

    df.set_index("time", inplace=True)  # set time as index so we can join them on this shared time
    df = df[[f"{ratio}_close", f"{ratio}_volume"]]  # ignore the other columns besides price and volume

    if len(main_df) == 0:  # if the dataframe is empty
        main_df = df  # then it's just the current df
    else:  # otherwise, join this data to the main one
        main_df = main_df.join(df)

main_df.ffill(inplace=True)  # if there are gaps in data, use previously known values
main_df.dropna(inplace=True)
print(main_df.head())  # how did we do??
BTC-USD_close BTC-USD_volume LTC-USD_close LTC-USD_volume
time
1528968720 6487.379883 7.706374 96.660004 314.387024
1528968780 6479.410156 3.088252 96.570000 77.129799
1528968840 6479.410156 1.404100 96.500000 7.216067
1528968900 6479.979980 0.753000 96.389999 524.539978
1528968960 6480.000000 1.490900 96.519997 16.991997

BCH-USD_close BCH-USD_volume ETH-USD_close ETH-USD_volume
time
1528968720 870.859985 26.856577 486.01001 26.019083
1528968780 870.099976 1.124300 486.00000 8.449400
1528968840 870.789978 1.749862 485.75000 26.994646
1528968900 870.000000 1.680500 486.00000 77.355759
1528968960 869.989990 1.669014 486.00000 7.503300
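To see what the join and the forward fill are doing, here is a small sketch on made-up data. The column names `A_close` and `B_close` and all the values are invented for illustration; only the pandas calls mirror the code above.

```python
import pandas as pd

# Two tiny, made-up price tables indexed by time; the second has no row for time 2.
a = pd.DataFrame({"A_close": [1.0, 1.1, 1.2]}, index=pd.Index([1, 2, 3], name="time"))
b = pd.DataFrame({"B_close": [5.0, 5.2]}, index=pd.Index([1, 3], name="time"))

joined = a.join(b)        # left join on the shared index; B_close is NaN at time 2
filled = joined.ffill()   # carry the last known B_close forward into the gap

print(joined)
print(filled)
```

Because `join` defaults to a left join, the first frame joined determines which timestamps survive, and any timestamp missing from a later frame shows up as NaN until it is filled.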
Next, we need to create a target. To do this, we need to know which price we’re trying to predict. We also need to know how far out we want to predict. We’ll go with Litecoin for now. How far out we can predict probably also depends on how long our sequences are. If our sequence length is 3 (so 3 minutes), we probably can’t easily predict out 10 minutes. If our sequence length is 300, 10 might not be as hard. I’d like to go with a sequence length of 60 and predict 3 periods into the future. We could also make the prediction a regression question, using a linear activation on the output layer, but, instead, I am going to just go with a binary classification.
If the price goes up in 3 minutes, then it’s a buy. If it goes down in 3 minutes, it’s a not-buy/sell. With all of that in mind, I am going to make the following constants:
SEQ_LEN = 60 # how long of a preceding sequence to collect for RNN
FUTURE_PERIOD_PREDICT = 3 # how far into the future are we trying to predict?
RATIO_TO_PREDICT = "LTC-USD"
Next, I am going to make a simple classification function that we’ll use to map in a moment:
def classify(current, future):
    if float(future) > float(current):
        return 1
    else:
        return 0
This function will take values from 2 columns. If the “future” column is higher, great, it’s a 1 (buy). Otherwise it’s a 0 (sell). To do this, first, we need a future column!
main_df['future'] = main_df[f'{RATIO_TO_PREDICT}_close'].shift(-FUTURE_PERIOD_PREDICT)
shift will just shift the column for us; a negative shift will shift it “up.” So shifting up 3 will give us the price 3 minutes in the future, and we’re just assigning this to a new column.
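Here is a small sketch of that shift on a made-up price series (the values are invented for illustration). It also shows a side effect worth knowing about: the last 3 rows have nothing 3 steps ahead of them, so their shifted value is NaN.

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="close")
future = prices.shift(-3)   # each row now holds the value from 3 rows later

print(pd.DataFrame({"close": prices, "future": future}))
# The last 3 rows have no value 3 steps ahead, so their "future" is NaN.
```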
Now that we’ve got the future values, we can use them to make a target using the function we made above.
main_df['target'] = list(map(classify, main_df[f'{RATIO_TO_PREDICT}_close'], main_df['future']))
The above can be confusing. Start by ignoring the list() part; this is just at the very end, and I’ll explain it in a minute.
The map() is used to map a function. The first parameter here is the function we want to map (classify), then the next ones are the parameters to that function. In this case, the current close price, and then the future price.
The map part is what allows us to do this row-by-row for these columns, but also do it quite fast. The list part converts the end result to a list, which we can just set as a column.
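As a quick illustration with plain Python lists, map() with two iterables behaves like looping over zip(). The prices below are rounded from the LTC-USD rows shown earlier, and classify is restated compactly so the snippet runs on its own.

```python
def classify(current, future):
    # Same rule as above: 1 if the future price is higher, otherwise 0.
    return 1 if float(future) > float(current) else 0

current_prices = [96.66, 96.57, 96.50, 96.39]
future_prices = [96.39, 96.52, 96.44, 96.47]

# map() pulls one value from each iterable per call, like zip().
targets = list(map(classify, current_prices, future_prices))
same_thing = [classify(c, f) for c, f in zip(current_prices, future_prices)]
print(targets)   # [0, 0, 0, 1]

# A missing future price (NaN) compares as False, so it is labeled 0.
print(classify(96.40, float("nan")))   # 0
```

The NaN case matters for the last few rows of the real data, where there is no price 3 minutes ahead: they are labeled 0 even though the outcome is unknown, so it may be worth dropping them before training.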
Great, let’s check out the data:
print(main_df.head())
BTC-USD_close BTC-USD_volume LTC-USD_close LTC-USD_volume
time
1528968720 6487.379883 7.706374 96.660004 314.387024
1528968780 6479.410156 3.088252 96.570000 77.129799
1528968840 6479.410156 1.404100 96.500000 7.216067
1528968900 6479.979980 0.753000 96.389999 524.539978
1528968960 6480.000000 1.490900 96.519997 16.991997

BCH-USD_close BCH-USD_volume ETH-USD_close ETH-USD_volume
time
1528968720 870.859985 26.856577 486.01001 26.019083
1528968780 870.099976 1.124300 486.00000 8.449400
1528968840 870.789978 1.749862 485.75000 26.994646
1528968900 870.000000 1.680500 486.00000 77.355759
1528968960 869.989990 1.669014 486.00000 7.503300

future target
time
1528968720 96.389999 0
1528968780 96.519997 0
1528968840 96.440002 0
1528968900 96.470001 1
1528968960 96.400002 0
Credit: BecomingHuman By: Vatsal Raval