Convolutional Neural Networks (CNNs / ConvNets) for Visual Recognition | by Sameer Bairwa | Nov, 2020

November 24, 2020
in Neural Networks

Convolutional Neural Networks are very similar to ordinary Neural Networks. They are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product, and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function: from the raw image pixels on one end to class scores at the other. And they still have a loss function (e.g. SVM/Softmax) on the last (fully-connected) layer and all the tips/tricks we developed for learning regular Neural Networks still apply.
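As a minimal sketch of what a single neuron computes, the snippet below takes a dot product of inputs and weights, adds a bias, and optionally applies a non-linearity. The function names and the numeric values are illustrative assumptions, not taken from any library.

```python
def relu(z):
    """A common non-linearity: threshold at zero."""
    return max(0.0, z)


def neuron(inputs, weights, bias, nonlinearity=None):
    """Dot product of inputs and weights plus bias, optionally followed by a non-linearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return nonlinearity(z) if nonlinearity else z


# Hypothetical inputs and learnable parameters.
inputs = [1.0, -2.0, 3.0]
weights = [0.5, 0.25, -1.0]
bias = 0.5

raw = neuron(inputs, weights, bias)          # linear output only
act = neuron(inputs, weights, bias, relu)    # with the non-linearity applied
print(raw, act)
```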

So What’s New in ConvNets?

ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the number of parameters in the network.

Regular Neural Nets

Neural Networks receive an input (a single vector), and transform it through a series of hidden layers. Each hidden layer is made up of a set of neurons, where each neuron is fully connected to all neurons in the previous layer, and where neurons in a single layer function completely independently and do not share any connections. The last fully-connected layer is called the “output layer” and in classification settings, it represents the class scores.
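The description above can be sketched as a tiny forward pass in plain Python. The layer sizes, weights, and the `dense` helper are made up for illustration; a real network would be far larger and its weights would be learned.

```python
def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """Fully-connected layer: every neuron is connected to every input value."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


# Input vector with 2 values (hypothetical).
x = [1.0, 2.0]

# Hidden layer: 3 neurons, each with 2 weights and 1 bias.
hidden_w = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
hidden_b = [0.0, 0.0, 0.0]

# Output layer: 2 neurons (2 classes), each connected to all 3 hidden neurons.
out_w = [[1.0, 1.0, 1.0], [0.0, 2.0, 0.0]]
out_b = [0.0, -1.0]

hidden = dense(x, hidden_w, hidden_b, relu)
scores = dense(hidden, out_w, out_b)  # class scores, no non-linearity
print(hidden, scores)
```

Neurons within one layer never read each other's outputs here, which matches the statement that they function independently.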

The CIFAR-10 dataset consists of 60,000 32×32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.

Regular Neural Nets don’t scale well to full images. In CIFAR-10, images are only of size 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully-connected neuron in a first hidden layer of a regular Neural Network would have 32*32*3 = 3072 weights. This amount still seems manageable, but clearly, this fully-connected structure does not scale to larger images. For example, an image of a more respectable size, e.g. 200x200x3, would lead to neurons that have 200*200*3 = 120,000 weights. Moreover, we would almost certainly want to have several such neurons, so the parameters would add up quickly! Clearly, this full connectivity is wasteful and the huge number of parameters would quickly lead to overfitting.
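The arithmetic above is easy to check directly. The sketch below reproduces the two per-neuron weight counts and then, as an assumption for illustration only, counts the parameters of a hypothetical first hidden layer with 1000 such neurons.

```python
def fc_weights_per_neuron(width, height, depth):
    """One weight per input value for a fully-connected neuron."""
    return width * height * depth


def fc_layer_params(width, height, depth, num_neurons):
    """Weights plus one bias per neuron for a whole fully-connected layer."""
    return num_neurons * (fc_weights_per_neuron(width, height, depth) + 1)


cifar = fc_weights_per_neuron(32, 32, 3)       # CIFAR-10 sized image
larger = fc_weights_per_neuron(200, 200, 3)    # 200x200x3 image

# Hypothetical layer width of 1000 neurons, chosen only to show the growth.
layer = fc_layer_params(200, 200, 3, 1000)
print(cifar, larger, layer)
```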

3D volumes of neurons. Convolutional Neural Networks take advantage of the fact that the input consists of images and they constrain the architecture in a more sensible way. In particular, unlike a regular Neural Network, the layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth. (Note that the word depth here refers to the third dimension of an activation volume, not to the depth of a full Neural Network, which can refer to the total number of layers in a network.) For example, the input images in CIFAR-10 are an input volume of activations, and the volume has dimensions 32x32x3 (width, height, depth respectively). As we will soon see, the neurons in a layer will only be connected to a small region of the layer before it, instead of all of the neurons in a fully-connected manner. Moreover, the final output layer would for CIFAR-10 have dimensions 1x1x10, because by the end of the ConvNet architecture we will reduce the full image into a single vector of class scores, arranged along the depth dimension. Here is a visualization:

Left: A regular 3-layer Neural Network. Right: A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers. Every layer of a ConvNet transforms the 3D input volume to a 3D output volume of neuron activations. In this example, the red input layer holds the image, so its width and height would be the dimensions of the image, and the depth would be 3 (Red, Green, Blue channels).

A ConvNet is made up of Layers. Every Layer has a simple API: It transforms an input 3D volume to an output 3D volume with some differentiable function that may or may not have parameters.
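One way to picture this API is a small sketch in which a volume is a nested list indexed as `volume[y][x][d]`. The class names, the nested-list layout, and the 2x2 pooling window are assumptions made for this example, not part of the article.

```python
from abc import ABC, abstractmethod


class Layer(ABC):
    @abstractmethod
    def forward(self, volume):
        """Map an input 3D volume to an output 3D volume."""


class ReLU(Layer):
    """Elementwise max(0, x); no parameters, output shape equals input shape."""

    def forward(self, volume):
        return [[[max(0.0, v) for v in pixel] for pixel in row] for row in volume]


class MaxPool2x2(Layer):
    """2x2 max pooling with stride 2; halves width and height, keeps depth."""

    def forward(self, volume):
        height, width, depth = len(volume), len(volume[0]), len(volume[0][0])
        return [
            [
                [
                    max(volume[y + dy][x + dx][d] for dy in (0, 1) for dx in (0, 1))
                    for d in range(depth)
                ]
                for x in range(0, width - 1, 2)
            ]
            for y in range(0, height - 1, 2)
        ]


def shape(volume):
    """Return (width, height, depth) of a nested-list volume."""
    return (len(volume[0]), len(volume), len(volume[0][0]))


# A tiny 4x4x1 volume whose first two rows are negative.
x = [[[float(r * 4 + c - 8)] for c in range(4)] for r in range(4)]
out = MaxPool2x2().forward(ReLU().forward(x))
print(shape(x), "->", shape(out))
```

Both layers here are parameter-free; a layer with weights would expose the same `forward` method and simply hold its parameters as attributes.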

As we described above, a simple ConvNet is a sequence of layers, and every layer of a ConvNet transforms one volume of activations to another through a differentiable function. We use three main types of layers to build ConvNet architectures: Convolutional Layer, Pooling Layer, and Fully-Connected Layer (exactly as seen in regular Neural Networks). We will stack these layers to form a full ConvNet architecture.

Example Architecture: Overview. We will go into more details below, but a simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail:

  • INPUT [32x32x3] will hold the raw pixel values of the image, in this case, an image of width 32, height 32, and with three color channels R, G, B.
  • CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. This may result in a volume such as [32x32x12] if we decided to use 12 filters.
  • RELU layer will apply an elementwise activation function, such as max(0, x), which thresholds at zero. This leaves the size of the volume unchanged ([32x32x12]).
  • POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x12].
  • FC (i.e. fully-connected) layer will compute the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, one for each of the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.
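The volume sizes in the list above can be traced with a short sketch. The text does not state the filter size, stride, or padding, so the values used here (3x3 filters with stride 1 and zero-padding 1 for CONV, a 2x2 window with stride 2 for POOL) are assumptions chosen because they reproduce the quoted shapes.

```python
def output_size(size, window, stride, padding=0):
    """Spatial output size of a sliding window: (size - window + 2*padding) // stride + 1."""
    return (size - window + 2 * padding) // stride + 1


def trace_shapes(width=32, height=32, depth=3, num_filters=12, num_classes=10):
    # Assumed hyperparameters (not given in the text): 3x3 filters, stride 1,
    # zero-padding 1 for CONV; 2x2 window with stride 2 for POOL.
    shapes = [("INPUT", (width, height, depth))]

    w = output_size(width, window=3, stride=1, padding=1)
    h = output_size(height, window=3, stride=1, padding=1)
    shapes.append(("CONV", (w, h, num_filters)))   # depth becomes the filter count
    shapes.append(("RELU", (w, h, num_filters)))   # elementwise, shape unchanged

    w = output_size(w, window=2, stride=2)
    h = output_size(h, window=2, stride=2)
    shapes.append(("POOL", (w, h, num_filters)))   # spatial downsampling only

    shapes.append(("FC", (1, 1, num_classes)))     # one score per class
    return shapes


shapes = trace_shapes()
for name, (w, h, d) in shapes:
    print(f"{name:5s} [{w}x{h}x{d}]")
```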

In this way, ConvNets transform the original image layer by layer from the original pixel values to the final class scores. Note that some layers contain parameters and others don’t. In particular, the CONV/FC layers perform transformations that are a function of not only the activations in the input volume but also of the parameters (the weights and biases of the neurons). On the other hand, the RELU/POOL layers will implement a fixed function. The parameters in the CONV/FC layers will be trained with gradient descent so that the class scores that the ConvNet computes are consistent with the labels in the training set for each image.
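To make the split between parameterized and fixed layers concrete, here is a rough parameter count for the example architecture. The 3x3 filter size is an assumption (the text only fixes the number of filters at 12), and the helper names are hypothetical.

```python
def conv_params(in_depth, num_filters, filter_size):
    """Each filter has filter_size * filter_size * in_depth weights plus one bias."""
    return num_filters * (filter_size * filter_size * in_depth + 1)


def fc_params(in_width, in_height, in_depth, num_outputs):
    """Each output neuron has one weight per input value plus one bias."""
    return num_outputs * (in_width * in_height * in_depth + 1)


params = {
    "CONV": conv_params(in_depth=3, num_filters=12, filter_size=3),  # assumed 3x3 filters
    "RELU": 0,  # fixed function, nothing to learn
    "POOL": 0,  # fixed function, nothing to learn
    "FC": fc_params(16, 16, 12, num_outputs=10),
}
total = sum(params.values())
print(params, total)
```

Under these assumptions the CONV layer needs only a few hundred parameters because each filter's weights are reused across all spatial positions, while the FC layer accounts for most of the total.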

Summary:

  • A ConvNet architecture is in the simplest case a list of Layers that transform the image volume into an output volume (e.g. holding the class scores)
  • There are a few distinct types of Layers (e.g. CONV/FC/RELU/POOL are by far the most popular)
  • Each Layer accepts an input 3D volume and transforms it into an output 3D volume through a differentiable function
  • Each Layer may or may not have parameters (e.g. CONV/FC do, RELU/POOL don’t)
  • Each Layer may or may not have additional hyperparameters (e.g. CONV/FC/POOL do, RELU doesn’t)

The activations of an example ConvNet architecture. The initial volume stores the raw image pixels (left) and the last volume stores the class scores (right). Each volume of activations along the processing path is shown as a column. Since it’s difficult to visualize 3D volumes, we lay out each volume’s slices in rows. The last layer volume holds the scores for each class, but here we only visualize the sorted top 5 scores and print the labels of each one.

In the next article, we will discuss the individual layers and the details of their hyperparameters and their connectivities.

Credit: BecomingHuman By: Sameer Bairwa
