Recurrent All-Pairs Field Transforms for Optical Flow | by Pierrick RUGERY | Oct, 2020

October 22, 2020
in Neural Networks

Feature Encoder

The role of the feature encoder is to extract a feature vector for each pixel of an image. The feature encoder consists of 6 residual blocks: 2 at 1/2 resolution, 2 at 1/4 resolution, and 2 at 1/8 resolution.

Figure 5: Residual block

It is commonly believed that simply increasing the depth of a convolutional network increases its accuracy. This is a misconception: beyond a certain depth, accuracy saturates because of problems such as the vanishing gradient. To avoid this, we can add residual blocks, which re-inject the block's input at the output of the convolution layers and add the two together.
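The skip connection can be sketched in a few lines of NumPy; `f` here is a stand-in for the conv/norm/ReLU stack of Figure 5, not code from the paper:

```python
import numpy as np

def residual_block(x, f):
    """Add the block's input back onto the output of a transform f.

    f stands in for the convolution stack of Figure 5; the skip
    connection re-injects x so information (and gradients) can bypass f.
    """
    return x + f(x)

# Toy check: even if f shrinks its input, the skip keeps x flowing through.
x = np.array([1.0, 2.0, 3.0])
out = residual_block(x, lambda v: 0.1 * v)  # -> [1.1, 2.2, 3.3]
```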


We additionally use a context network. The context network extracts features only from the first input image (Frame 1 in Figure 4). The architecture of the context network hθ is identical to that of the feature extraction network. Together, the feature network gθ and the context network hθ form the first stage of the approach, which only needs to be performed once.

Multi-Scale 4D Correlation Volumes

To build the correlation volume, we simply take the dot product between all pairs of output features from the feature encoder; each dot product measures the alignment (similarity) between two feature vectors. The output is the 4D correlation volume C(gθ(I1), gθ(I2)) ∈ R^(H×W×H×W).
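The all-pairs dot product can be written as a single `einsum`; the sizes below are toy placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, D = 4, 5, 8                      # toy spatial size and feature dimension
g1 = rng.standard_normal((H, W, D))    # per-pixel features of frame 1
g2 = rng.standard_normal((H, W, D))    # per-pixel features of frame 2

# All-pairs dot products: C[i, j, k, l] = <g1[i, j], g2[k, l]>
C = np.einsum('ijd,kld->ijkl', g1, g2)
print(C.shape)  # (4, 5, 4, 5), i.e. H x W x H x W
```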

Correlation Pyramid

Figure 6: Correlation pyramid

They construct a 4-layer pyramid {C1, C2, C3, C4} by average-pooling the last two dimensions of the correlation volume with kernel sizes 1, 2, 4, and 8 and equivalent strides (Figure 6). Only the last two dimensions are pooled, so the first two keep the high-resolution information that helps identify the rapid motion of small objects.

C1 contains the pixel-wise correlations, C2 the correlations averaged over 2×2 blocks, and C3 the correlations averaged over 4×4 blocks.

Figure 7: Average Pooling part 1

An example of average pooling with kernel size 2 and stride 2. As Figures 7 and 8 show, average pooling computes the mean of the values selected by the kernel.

Figure 8: Average Pooling final result

Thus, volume Ck has dimensions H × W × H/2^k × W/2^k. The set of volumes gives information about both large and small displacements.
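The pyramid construction can be sketched by pooling only the last two axes; `pool_last2` is an illustrative helper, not code from the paper:

```python
import numpy as np

def pool_last2(C, k):
    """Average-pool the last two axes of C with kernel and stride 2**k."""
    H, W, H2, W2 = C.shape
    s = 2 ** k
    return C.reshape(H, W, H2 // s, s, W2 // s, s).mean(axis=(3, 5))

C = np.arange(8 * 8, dtype=float).reshape(1, 1, 8, 8)   # toy 1x1x8x8 volume
pyramid = [pool_last2(C, k) for k in range(4)]          # {C1, C2, C3, C4}
print([p.shape for p in pyramid])
# [(1, 1, 8, 8), (1, 1, 4, 4), (1, 1, 2, 2), (1, 1, 1, 1)]
```

The first two axes keep their full H × W resolution at every level, which is exactly why pooling is restricted to the last two dimensions.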

Correlation Lookup

They define a lookup operator LC which generates a feature map by indexing from the correlation pyramid.

How does indexing work?

Each layer of the pyramid has dimensions H × W × H/2^k × W/2^k. To index it at real-valued coordinates, we use bilinear interpolation, which in effect lets us sample C_k as if it lived in the full H × W × H × W space.

Figure 9: Bilinear interpolation

To estimate the value at a point (x, y), we use the four surrounding grid points (x1, y1), (x1, y2), (x2, y1) and (x2, y2). We first interpolate along one axis to obtain the intermediate points (x, y1) and (x, y2), then interpolate between those to obtain the value at (x, y).
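The two-step interpolation can be sketched as follows; `bilinear` and `img` are illustrative names, assuming a small 2-D grid:

```python
import numpy as np

def bilinear(img, x, y):
    """Sample img at the real-valued coordinate (x, y)."""
    x1, y1 = int(np.floor(x)), int(np.floor(y))
    x2, y2 = x1 + 1, y1 + 1
    wx, wy = x - x1, y - y1
    # Interpolate along x on the two surrounding rows, then along y.
    top = (1 - wx) * img[y1, x1] + wx * img[y1, x2]
    bot = (1 - wx) * img[y2, x1] + wx * img[y2, x2]
    return (1 - wy) * top + wy * bot

img = np.array([[0.0, 1.0],
                [2.0, 3.0]])
val = bilinear(img, 0.5, 0.5)  # centre of the grid: mean of the four corners, 1.5
```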

Given a current estimate of optical flow (f1,f2), we map each pixel x = (u,v) in image 1 to its estimated correspondence in image 2: x′ = (u + f1(u), v + f2(v)).

Iterative Updates

For each pixel of image 1 the optical flow is initialized to 0. Starting from image 1, we look for the flow field f that gives the position of each of its pixels within image 2, using the multi-scale correlation volume.

We evaluate f through a sequence of estimates {f1, …, fn}, as we would in an iterative optimization problem. The update operator takes the flow, the correlation features, and a latent hidden state as input, and outputs an update ∆f and an updated hidden state.

By default the flow f is set to 0. We use the features looked up from the 4D correlation volume, together with the context features of image 1, to predict the flow; the correlation lookup is what lets us estimate the displacement of each pixel between images 1 and 2. To do this, a GRU (Gated Recurrent Unit) recurrent network is used. Its input is the concatenation of the looked-up correlation features, the current flow, and image 1's features from the context encoder. A standard GRU is built from fully connected layers, but here, to adapt it to a computer-vision problem, these are replaced by convolutional layers.

Figure 10: GRU with convolution

To simplify, a GRU works with two main gates: the reset gate, which removes non-essential information from h(t−1), and the update gate, which defines h(t). The reset gate Rt reduces the influence of h(t−1) on the prediction of h(t); when Rt is close to 1, we recover the classical behavior of a recurrent neural network. The update gate determines the extent to which the new state h(t) keeps the old state h(t−1) versus how much of the new candidate state h̃(t) is used.
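The gate equations of Figure 11 can be sketched in plain NumPy. The sizes and weight matrices below are toy placeholders (biases omitted); in RAFT the matrix products become 2D convolutions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h_prev, x, Wz, Wr, Wh):
    """One GRU step on the concatenation [h_prev, x]."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(Wz @ hx)   # update gate: how far to move toward the candidate
    r = sigmoid(Wr @ hx)   # reset gate: how much of h_prev feeds the candidate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x]))
    return (1 - z) * h_prev + z * h_cand

# Toy sizes: hidden state of 3, input of 2; zero weights as placeholders.
h_prev = np.array([0.2, -0.4, 0.6])
x = np.array([1.0, 2.0])
Wz = Wr = Wh = np.zeros((3, 5))
h = gru_step(h_prev, x, Wz, Wr, Wh)  # zero weights give z = 0.5, so h = 0.5 * h_prev
```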

Figure 11: GRU with fully connected layers

In our case we use the GRU to update the optical flow. We take the correlation features and image 1's context features as input and estimate the optical flow, i.e. the displacement of a pixel (x, y) into image 2. We use 10 iterations, i.e. we compute f10 as the final estimate of this displacement.

Loss

Figure 12: Loss function

The loss function is defined as the L1 norm between the ground-truth flow and the predicted optical flow. Each iteration's error carries a gamma weight that increases exponentially with the iteration index, meaning the later the GRU iteration, the more its error is penalized. For example, take fgt = 2, f0 = 1.5 and f10 = 1.8: L0 = 0.8¹⁰ × 0.5 ≈ 0.054, while L10 = 0.8⁰ × 0.2 = 0.2. We quickly see that an error counts for more at the final iteration than at the beginning.
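The worked example above can be checked with a few lines of Python; γ = 0.8 and N = 10 follow the numbers in the example, and `preds` holds the hypothetical flow estimates:

```python
# Exponentially weighted L1 loss per iteration: gamma^(N - i) * |f_gt - f_i|
gamma, N = 0.8, 10
f_gt = 2.0
preds = {0: 1.5, 10: 1.8}   # hypothetical flow estimates f_0 and f_10

def iter_loss(i):
    return gamma ** (N - i) * abs(f_gt - preds[i])

L0, L10 = iter_loss(0), iter_loss(10)   # L0 ~ 0.054, L10 = 0.2
```

The early estimate f0, despite being further from the ground truth, contributes less to the total loss than the final estimate f10.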

Figure 13: Example of optical flow prediction with RAFT

Credit: BecomingHuman By: Pierrick RUGERY
