Often heard, but rarely understood: machine learning and deep learning. Both are techniques for realizing artificial intelligence, which is the ability of computers to perform tasks commonly associated with intelligent beings.
Let’s clear up the confusion today. In this post we briefly explain machine learning and deep learning. We also take a look at the computer vision field, since this is our daily work.
Machine learning is the method of teaching a machine to solve a problem by observing examples instead of manually implementing how the machine should solve the problem.
In a nutshell, machine learning works like this: we train a machine learning model (for example a neural network, NN) with input data (e.g. sensor data) and the outputs we expect (e.g. increasing or decreasing a motor’s speed), and the model learns how to solve the problem. Once the model is trained properly, it can take real-world inputs and hopefully deliver the expected output.
It turned out that a lot of real-world problems can be solved this way, although the results are still far from what a human being can do.
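To make the idea of learning from examples concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the sensor readings, the expected motor adjustments, and the helper names train and predict are all made up for this post, and the model is a single straight line rather than a neural network.

```python
# Hypothetical example data: sensor readings and the motor adjustment we expect.
# The hidden rule is adjustment = 2 * reading - 1, but the model is never told this.
sensor_readings = [0.0, 1.0, 2.0, 3.0, 4.0]
expected_adjustments = [-1.0, 1.0, 3.0, 5.0, 7.0]


def train(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit y = weight * x + bias by gradient descent on the mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            error = (weight * x + bias) - y  # how far off the current guess is
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Nudge the parameters in the direction that reduces the error.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


def predict(model, x):
    """Apply the trained model to a new, unseen input."""
    weight, bias = model
    return weight * x + bias


model = train(sensor_readings, expected_adjustments)
print(predict(model, 5.0))  # close to 9.0, although 5.0 was never in the examples
```

Nobody wrote the rule "multiply by two and subtract one" into the program. The model recovered it from the examples alone, which is the whole point of machine learning.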
Deep learning is the evolution of machine learning. In a sense, it’s a rebranding: deep learning is still machine learning, just with more complex models.
Deep learning really started to gain traction around 2006, with convolutional neural networks (CNNs) later becoming its best-known models. The increase in computational power made it possible to train more complex machine learning models. The more complex a model is, the more data you need to train it properly. The internet made it possible to create data sets that are big enough for complex models.
It’s important to know that neural network models are organized in layers. Very complex models have a lot of layers, one after another. That’s why they are called deep learning models, and this is why deep learning is deep.
A deep learning model works like this: each layer processes its inputs and passes the results to the next layer, and so on until the last layer, which produces the desired output (for example a class or a detection result when it comes to images).
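The layer-by-layer flow described above can be sketched in a few lines of plain Python. This is a toy illustration, not how real models are built: the weights in LAYERS are hand-picked, hypothetical values (a real model would learn them during training), and the helper names dense, relu, softmax, forward and classify are our own.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Simple non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, hand-picked weights: each entry is (weights, biases) for one layer.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]),  # 2 inputs -> 3 units
    ([[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]], [0.0, 0.0]),         # 3 units -> 2 classes
]


def forward(inputs):
    """Pass the inputs through every layer in order and return class probabilities."""
    activations = inputs
    for index, (weights, biases) in enumerate(LAYERS):
        activations = dense(activations, weights, biases)
        if index < len(LAYERS) - 1:
            activations = relu(activations)  # hidden layers only
    return softmax(activations)


def classify(inputs):
    """Return the index of the most probable class."""
    probabilities = forward(inputs)
    return probabilities.index(max(probabilities))


print(classify([2.0, 0.0]))  # 0
print(classify([0.0, 2.0]))  # 1
```

A real deep learning model follows the same pattern, only with many more layers and millions of learned weights instead of a handful of fixed ones.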
And what is computer vision anyway? Computer vision enables computers to see, identify and process images like humans do.
So far, deep learning is the best method for computer vision because it can cope with complex inputs such as images. Inputs like these require very deep models, which is why deep learning is applied to computer vision problems.
The improvements in deep learning from 2012 to now are huge, and computer vision tasks benefit from them in particular. Back in 2012, it was possible to classify an image into 1000 categories. Now, deep learning is used for image super-resolution, image style transfer, and self-driving cars, just to name a few examples.
We at pixolution started in 2009 with a handcrafted visual search algorithm and began to experiment with deep learning in 2011. In 2017 we released the first CNN-based version of our product pixolution flow.
Today, we’re capable of handling various projects related to classification problems or search tasks. For example, we’ve built a font recognition engine capable of classifying fonts. Recently, we’ve created a prototype that finds suitable regions in images for adding content, for example the price of a pictured product. Deep learning enables all of these use cases.
If you have any need for individual AI models adapted to your individual data set, let’s have a chat. We’re thrilled to face new challenges and to be part of the deep learning evolution.