How Dataset size and RAM choke your Deep Learning for Computer Vision | by Yefeng Xia | Aug, 2020

September 5, 2020

When working with large image datasets, computer memory is easily overloaded. Many people underestimate how large an image dataset can be. The MNIST dataset, although each handwritten digit is only 28×28 pixels, consists of a training set of 60,000 examples and a test set of 10,000 examples. The downloaded files don’t need much hard drive capacity, but once a dataset is read into a NumPy array it can occupy far more memory (RAM) than it does on disk. With larger datasets, instead of an array we get an “out of memory” error message on the screen. What is worse, as data science develops, the datasets used in research keep growing. The COCO dataset, the Cityscapes dataset, and others need both much more hard drive capacity and much more memory.
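
A rough way to see how array shape and data type drive memory use is to multiply them out. The sketch below is only an illustration: the helper name array_megabytes is hypothetical, and the "large" dataset (5,000 RGB images at 1024×2048, stored as float32) is an assumed example rather than a measurement from this article.

```python
import numpy as np


def array_megabytes(num_images, height, width, channels=1, dtype=np.uint8):
    """Size in MB (10**6 bytes) of a dense array of images with the given shape."""
    bytes_per_value = np.dtype(dtype).itemsize
    return num_images * height * width * channels * bytes_per_value / 1e6


# MNIST: 60,000 training + 10,000 test digits, 28x28, one channel.
mnist_uint8 = array_megabytes(70_000, 28, 28, dtype=np.uint8)
mnist_float32 = array_megabytes(70_000, 28, 28, dtype=np.float32)

# Assumed example of a high-resolution RGB dataset held entirely in RAM.
large_float32 = array_megabytes(5_000, 1024, 2048, channels=3, dtype=np.float32)

print(f"MNIST as uint8:   {mnist_uint8:.2f} MB")
print(f"MNIST as float32: {mnist_float32:.2f} MB")
print(f"Large RGB set:    {large_float32 / 1000:.1f} GB")
```

Converting pixel values from uint8 to float32 alone multiplies the footprint by four, and high-resolution color images push the total well beyond the RAM of a typical workstation.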

It seems that we have to buy better and more expensive equipment to cope with limited computer memory. Otherwise, we can’t process these huge image datasets.

On the other hand, deep learning algorithms require a great deal of computation, which can also exhaust computer memory. Classification, detection, and segmentation algorithms in computer vision based on deep neural networks (DNNs) handle enormous volumes of data. In general, the more training data, the better the result. We have access to big datasets such as Pascal VOC, COCO, and Cityscapes, which are open and free for everyone, but our poor RAM doesn’t let us process that much data at once. Either the dataset size or the RAM chokes my deep learning like a Force choke👹.

star wars movie scene

In this case, we can read images from a specific directory, not all at once but a fixed number at a time. The idea is similar to the batch size in Keras/TensorFlow. The difference is that with the batch_size setting alone, all images are already loaded into memory in advance, which causes a heavy memory load. Instead, we import the image dataset and save only the image names/paths in a list or dictionary, so that we can load the images when we want to feed them into the DNN for training. Image data that has already been used for training can be deleted as soon as possible. That way, only one subset of the dataset exists in memory at each stage of training.

flow chart of reading and processing subset of a dataset
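
The idea of keeping only file paths in memory and loading one subset at a time could be sketched as a small generator. This is a minimal illustration under assumptions, not the author's implementation: iter_subsets and fake_load_image are hypothetical names, and the loader returns blank 8×8 arrays in place of real image decoding.

```python
import gc

import numpy as np


def iter_subsets(image_paths, subset_size, load_image):
    """Yield (start_index, array) pairs, holding only one subset in memory at a time."""
    for start in range(0, len(image_paths), subset_size):
        paths = image_paths[start:start + subset_size]
        subset = np.stack([load_image(p) for p in paths])
        yield start, subset
        # Drop the generator's reference once the caller asks for the next subset.
        del subset
        gc.collect()


def fake_load_image(path):
    # Stand-in for a real image reader; returns a blank 8x8 RGB image.
    return np.zeros((8, 8, 3), dtype=np.uint8)


paths = [f"img_{i:04d}.jpg" for i in range(10)]  # only the names are kept in memory
shapes = [subset.shape for _, subset in iter_subsets(paths, 4, fake_load_image)]
print(shapes)
```

In a real training loop, each yielded subset would be passed to the model's training call before the next one is loaded.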

With this method, we not only reduce the memory load of DNN training at each step but may also improve the performance of the model. The reasoning is similar to why a small batch_size is often preferred in DNN training: it tends to give better generalization. The number of subsets of the original dataset can be regarded as a new hyperparameter that is worthy of further study. For the current objective, though, the method lets us overcome the limit that RAM places on dataset size.

It’s exciting: we don’t need to upgrade our old equipment or give up on deep learning for computer vision😬 👻 🔮.

The following code performs image segmentation on Pascal VOC 2012, which contains 2913 segmentation images. The model is built with Keras. We divide the images into 20 overlapping subsets. Each subset is trained for 5 epochs with the same hyperparameters, such as optimizer, learning rate, and batch size. The function “getSegmentationArr” is a custom data preprocessing step for semantic segmentation with Fully Convolutional Networks (FCN). gc.collect() is a function from Python’s garbage collector interface that releases unreferenced memory. The plan is to train on 1000 images at a time and, after each training round, move on to a new subset. The model is updated after each epoch and carried over from one subset to the next. The starting image index of a subset is incremented by 100 each time, so the first subset covers image indices 0 to 1000 and the last subset, which also contains 1000 images, starts at image index 1900 and ends at image index 2900. This way, 20 training rounds make full use of the first 2900 images. Sorry🙃 that I simply left the last 13 images out of training. I think that’s okay, since we still get a good model.

import gc

import numpy as np
from keras import optimizers
from keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.utils import shuffle  # assumed source of shuffle()

# X, segmentations, dir_seg, nClasses, output_width, output_height, model and
# getSegmentationArr are assumed to be defined earlier.

# Save the best model so far. Keras 2.3+ logs 'val_accuracy'; older versions log 'val_acc'.
checkpoint = ModelCheckpoint("FCN_8.h5", monitor="val_accuracy", verbose=1,
                             save_best_only=True, save_weights_only=False, mode="auto")
early = EarlyStopping(monitor="val_loss", min_delta=0, patience=3, verbose=1, mode="auto")

train_rate = 0.9
num_rounds = 20
for subset_idx in range(num_rounds):
    # Sliding window of 1000 images whose start moves forward by 100 each round.
    start, stop = subset_idx * 100, 1000 + subset_idx * 100
    X_batch = X[start:stop]
    Y = np.asarray([getSegmentationArr(dir_seg + seg, nClasses, output_width, output_height)
                    for seg in segmentations[start:stop]])

    # Random 90/10 split of the current subset into training and validation data.
    X_batch, Y = shuffle(X_batch, Y)
    index_train = np.random.choice(X_batch.shape[0], int(X_batch.shape[0] * train_rate), replace=False)
    index_test = list(set(range(X_batch.shape[0])) - set(index_train))
    X_train, y_train = X_batch[index_train], Y[index_train]
    X_test, y_test = X_batch[index_test], Y[index_test]

    adam = optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(loss="categorical_crossentropy", optimizer=adam, metrics=["accuracy"])
    hist1 = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                      batch_size=32, epochs=5, callbacks=[checkpoint, early], verbose=1)

    # Free the current subset before loading the next one; keep the last for evaluation.
    if subset_idx != num_rounds - 1:
        del X_batch, Y, index_test, index_train, X_train, y_train, X_test, y_test
        gc.collect()
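
The index arithmetic of the loop can be checked on its own, without any images or model. The snippet below is a small sketch that only reproduces the window boundaries described above; subset_windows is a hypothetical helper name.

```python
def subset_windows(num_rounds=20, window=1000, stride=100):
    """Return the (start, stop) image indices used in each training round."""
    return [(r * stride, r * stride + window) for r in range(num_rounds)]


windows = subset_windows()

# Collect every image index that appears in at least one window.
covered = set()
for start, stop in windows:
    covered.update(range(start, stop))

print(windows[0], windows[-1], len(covered))
```

Consecutive windows share 900 images, so most images are seen in several rounds, while the 13 images from index 2900 onward are never loaded.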

After the 20 loops, we get the final FCN segmentation model, which reaches a validation accuracy of 0.9224 and a mean IoU of 0.737. For a simple FCN, that’s not bad.

FCN model output compared with ground truth
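
For readers unfamiliar with the metric, mean IoU (intersection over union, averaged over classes) can be computed from predicted and ground-truth label maps roughly as follows. This is a generic sketch, not necessarily the exact evaluation code behind the reported 0.737; the function name and the tiny example arrays are made up for illustration.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes present in the ground truth or the prediction."""
    ious = []
    for c in range(num_classes):
        true_c = (y_true == c)
        pred_c = (y_pred == c)
        union = np.logical_or(true_c, pred_c).sum()
        if union == 0:
            continue  # class appears in neither map, so it is skipped
        intersection = np.logical_and(true_c, pred_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


# Tiny 2x2 label maps with two classes (0 and 1).
y_true = np.array([[0, 0], [1, 1]])
y_pred = np.array([[0, 1], [1, 1]])
score = mean_iou(y_true, y_pred, num_classes=2)
print(score)
```

Here class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.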

Deng, Li. “The mnist database of handwritten digit images for machine learning research [best of the web].” IEEE Signal Processing Magazine 29.6 (2012): 141-142.

Cordts, Marius, et al. “The cityscapes dataset for semantic urban scene understanding.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

Lin, Tsung-Yi, et al. “Microsoft coco: Common objects in context.” European conference on computer vision. Springer, Cham, 2014.

Everingham, Mark, et al. “The pascal visual object classes (voc) challenge.” International journal of computer vision 88.2 (2010): 303–338.

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.

Credit: BecomingHuman By: Yefeng Xia
