BlockDrop to Accelerate Neural Network Inference, by IBM Research

July 10, 2020

Scaling AI with Dynamic Inference Paths in Neural Networks

Introduction

IBM Research, together with the University of Texas at Austin and the University of Maryland, has created a technique called BlockDrop that promises to speed up convolutional neural network inference without any loss of fidelity.

This could further expand the use of neural nets, particularly in environments with limited computing capability.

Increases in accuracy have been accompanied by increasingly complex and deep network architectures. This presents a problem for domains where fast inference is essential, particularly in delay-sensitive and real-time scenarios such as autonomous driving, robotic navigation, or user-interactive applications on mobile devices.

Further research shows that regularization techniques designed for fully connected layers, such as dropout, are less effective for convolutional layers: activation units there are spatially correlated, so information can still flow through the network even when individual units are dropped.

The BlockDrop method introduced by IBM Research complements existing model compression techniques: this form of structured dropout removes spatially correlated information, and the residual blocks that are kept for evaluation can be further pruned for an even greater speedup.

Figure: a residual block, the building block of a ResNet. Source: here

The figure below illustrates the block-drop mechanism for an image fed into a convolutional network. The green regions in the two right-hand panels mark the activation units that carry semantic information about the input image. Dropping activations at random is not effective at removing that semantic information, because nearby activations contain closely related information. The better strategy is to drop contiguous regions, which removes specific semantic content (e.g., head or feet) and compels the remaining units to learn features for classifying the input image.

Source: https://papers.nips.cc/paper/8271-dropblock-a-regularization-method…
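
To make the idea concrete, here is a minimal sketch of this kind of structured dropout in PyTorch, in the spirit of the DropBlock paper linked above. It is an illustrative implementation, not the authors' code, and assumes an odd block_size that fits inside the feature map:

```python
import torch
import torch.nn.functional as F

def drop_block(x, block_size=5, drop_prob=0.1):
    """Structured dropout (DropBlock-style): zero out contiguous
    block_size x block_size regions of the feature map instead of
    independent units. Assumes training mode and an odd block_size.

    x: feature map of shape (N, C, H, W)
    """
    if drop_prob == 0.0:
        return x
    n, c, h, w = x.shape
    # gamma rescales drop_prob so the expected fraction of dropped units
    # stays near drop_prob even though each seed zeroes a whole block
    gamma = (drop_prob / block_size ** 2) * (h * w) / (
        (h - block_size + 1) * (w - block_size + 1))
    # sample block centers only where a full block fits in the map
    seeds = (torch.rand(n, c, h - block_size + 1, w - block_size + 1,
                        device=x.device) < gamma).float()
    seeds = F.pad(seeds, [block_size // 2] * 4)
    # expand each seed into a block_size x block_size dropped region
    mask = 1.0 - F.max_pool2d(seeds, kernel_size=block_size, stride=1,
                              padding=block_size // 2)
    # rescale kept activations to preserve the expected magnitude
    return x * mask * mask.numel() / mask.sum().clamp(min=1.0)
```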

Policy Network for Dynamic Inference Paths

The BlockDrop mechanism learns to dynamically choose which layers of a deep network to execute during inference, so as to best reduce total computation without degrading prediction accuracy. It exploits the robustness of Residual Networks (ResNets) by dropping layers that are not needed to reach the desired level of accuracy, dynamically selecting residual blocks for each novel image. Thus it aids in the following (a code sketch follows the list):

  • Allocating system resources more efficiently.
  • Facilitating further insights into ResNets, e.g., whether different blocks encode information about different objects.
  • Achieving minimal block usage through image-specific decisions about which blocks to drop.
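
As a sketch of this dynamic selection (illustrative only; `block.residual` is a hypothetical method that computes the block's transform F(x) without the identity shortcut):

```python
import torch

def dynamic_forward(x, blocks, policy, head):
    """Evaluate a pre-trained ResNet, executing only the residual
    blocks that the policy keeps for this particular input.

    blocks: list of residual block modules of the ResNet
    policy: maps an image batch to (N, num_blocks) keep-probabilities
    head:   the ResNet's final pooling and classifier layers
    """
    keep = policy(x) > 0.5                    # one-shot binary decisions
    for i, block in enumerate(blocks):
        mask = keep[:, i].float().view(-1, 1, 1, 1)
        # the identity shortcut makes skipping safe: a dropped block
        # simply passes x through unchanged
        x = x + mask * block.residual(x)
    return head(x)
```

Note that with batched inputs as written, the transform is still computed and then masked out; the actual savings appear when each image follows its own execution path, so that dropped blocks are never evaluated at all.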

For example, given a pre-trained ResNet, a policy network is trained in an associative reinforcement learning setting with a dual reward: use a minimal number of blocks while preserving recognition accuracy. Experiments on CIFAR and ImageNet reveal that the learned policies not only accelerate inference but also encode meaningful visual information. With this method, a ResNet-101 model achieves a speedup of 20% on average, and as high as 36% for some images, while maintaining the same 76.4% top-1 accuracy on ImageNet.

The BlockDrop strategy learns a model, referred to as the policy network, that, given a novel input image, outputs the posterior probabilities of the binary decisions for dropping or keeping each block in a pre-trained ResNet. The policy network is trained with curriculum learning to maximize a reward that incentivizes using as few blocks as possible while preserving prediction accuracy.
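
A sketch of such a reward, under the assumption (consistent with the description above) that correct predictions earn a bonus that grows as fewer blocks are used, while incorrect ones incur a fixed penalty gamma:

```python
import torch

def blockdrop_reward(keep, correct, gamma=1.0):
    """Reward that favors sparse paths but punishes mistakes.

    keep:    (N, K) binary keep/drop decisions over K residual blocks
    correct: (N,) bool, whether the gated network classified correctly
    """
    usage = keep.float().mean(dim=1)      # fraction of blocks kept
    reward = 1.0 - usage ** 2             # fewer blocks -> larger reward
    return torch.where(correct, reward, torch.full_like(reward, -gamma))
```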

In addition, the pre-trained ResNet is jointly fine-tuned with the policy network so that its feature transformations are adapted to the block-dropping behavior. The method is an instantiation of associative reinforcement learning, where all decisions are taken in a single step given the context (i.e., the input instance) [1]. This makes policy execution lightweight and scalable to very deep networks.
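
Because every decision is made in a single step, the policy can be trained with a plain REINFORCE-style update. The sketch below is not the paper's code; it assumes a hypothetical `gated_net(images, keep)` helper that runs the ResNet with the given block decisions, and reuses `blockdrop_reward` from the sketch above:

```python
import torch

def policy_update(images, labels, policy, gated_net, optimizer, gamma=1.0):
    """One single-step policy-gradient update for BlockDrop-style training."""
    probs = policy(images)                           # (N, K) keep-probabilities
    dist = torch.distributions.Bernoulli(probs)
    keep = dist.sample()                             # all decisions in one step
    logits = gated_net(images, keep)                 # forward with kept blocks only
    correct = logits.argmax(dim=1) == labels
    reward = blockdrop_reward(keep, correct, gamma)
    # REINFORCE: raise the log-probability of decision patterns
    # in proportion to the reward they earned
    loss = -(reward.detach() * dist.log_prob(keep).sum(dim=1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```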

A recurrent model such as an LSTM could also serve as the policy network; however, the research findings show a CNN to be more efficient, with similar performance.
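
For illustration, a lightweight CNN policy network might look like the following; the layer sizes here are hypothetical, not the paper's:

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Small CNN mapping an image to per-block keep-probabilities."""
    def __init__(self, num_blocks):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),           # global pooling keeps it cheap
        )
        self.head = nn.Linear(32, num_blocks)

    def forward(self, x):
        z = self.features(x).flatten(1)
        return torch.sigmoid(self.head(z))     # keep-probability per block
```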

Conceptually, BlockDrop learns a policy that selects the minimal configuration of blocks needed to correctly classify a given input image. The resulting instance-specific paths through the network not only reflect the image's difficulty (easier samples use fewer blocks) but also encode meaningful visual information (patterns of blocks correspond to clusters of visual features).

The policy network works as follows: given a new image, it outputs a drop-or-keep decision for each block in the pre-trained ResNet, and only the retained blocks are then used to evaluate the prediction. The policy reward accounts for both block usage and prediction accuracy. The policy network is first trained to optimize the expected reward with a curriculum learning strategy, and then jointly fine-tuned with the ResNet.

Samples from ImageNet bear this out: images that are correctly classified with the fewest blocks tend to contain a single frontal-view object positioned in the center, and are indeed easier to identify, while samples that require the most blocks involve several objects, occlusion, or cluttered backgrounds.

This supports the hypothesis that block usage is a function of instance difficulty: BlockDrop automatically learns to "sort" images into easy and hard cases.

Library and Usage

The source code and comments are available on the GitHub page, here.

Conclusion

In this blog we have discussed the BlockDrop strategy, which is aimed at speeding up inference in neural networks. It has the following characteristics:

  • Speeds up AI-based computer vision operations.
  • Takes approximately 200 times less power per pixel than comparable systems using traditional hardware.
  • Facilitates deployment of top-performing deep neural network models on mobile devices by effectively reducing their storage and computational costs.
  • Determines the minimal configuration of layers, or blocks, needed to correctly classify a given input image; simpler images allow more layers to be dropped, saving more time.
  • Extends to ResNets for faster inference by selectively choosing which residual blocks to evaluate, in a learned and optimized manner conditioned on the input.
  • Shows considerable gains over existing methods in the efficiency-accuracy trade-off, in extensive experiments on CIFAR and ImageNet.

References

  1. BlockDrop: Dynamic Inference Paths in Residual Networks. https://arxiv.org/pdf/1711.08393.pdf
  2. https://www.ibm.com/blogs/research/2018/12/ai-year-review/

Credit: Data Science Central. By Sharmistha Chatterjee.
