Neural networks can be daunting when you don’t know much about them, especially when the internet is full of videos showing enormous, complex ones. The goal of this article is to introduce anyone (even the most complete beginner) to one of the building blocks of neural networks: the Perceptron (more precisely, the *single-layer Perceptron*). We will write a functioning one in Python, and hopefully you will gain some intuition on how it works. The article is divided into two parts: first we’ll investigate the different components of a Perceptron and how they work, and only once that is done will we get into coding one. (Note that while the first part requires no prior knowledge, for the second part I expect you to be somewhat familiar with Python.)

I believe explaining each element of the diagram above is a good way to get a first mental image of what a Perceptron is, but first let me phrase the problem it is going to solve:

*We want to separate (x, y) points into 2 categories: those above a certain line and those below*

Put simply, we are going to give our Perceptron a point (more precisely, its coordinates) and ask it: “Is this point above or below the line?” The line in question is one we will have chosen beforehand, the one we will **train** our Perceptron on (interesting terms coming up).

You may think this is kind of a lame problem to solve… and I would agree with you: it doesn’t sound interesting, and it could be solved really easily (and perfectly) using simple math. **A single-layer Perceptron is quite limited**; in fact, it can only solve problems like this one, where the two categories can be separated by a straight line. But the goal here is not to solve any kind of fancy problem, it is to understand **how the Perceptron is going to solve this simple problem.**
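To make the problem concrete, here is what such a dataset could look like. The line, the points, and the function name `label` are all arbitrary choices of mine for illustration, assuming we train on the line y = 2x + 1:

```python
# Points above y = 2x + 1 get label 1, points below get label -1.
# The slope (2) and intercept (1) are arbitrary choices for this example.
def label(x, y):
    return 1 if y >= 2 * x + 1 else -1

# A few hand-picked points and the labels we would expect the
# trained Perceptron to reproduce.
points = [(0, 5), (1, 0), (-2, -10), (3, 8)]
labels = [label(x, y) for x, y in points]
print(labels)  # [1, -1, -1, 1]
```

This is the “perfect” math solution mentioned above; the whole exercise is getting the Perceptron to learn it on its own.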

Okay, now that we know what our goal is, let’s take a look at this Perceptron diagram. What do all these letters mean? Let’s first take a look at the X’s: what are they and why are they here? X_0 is a special one (as you may have noted from the “*= 1*” next to it), so please don’t pay attention to it right now; it will make an enormous amount of sense when I get to it. The other X’s represent the **input**. If we want to ask our Perceptron whether a point is above or below the line we have chosen, it must know about the point, and this is where we give it that information. You may have guessed it: in our case we will only need an X_1 and an X_2, since our *input* (the point) can be represented with just its **x** and its **y** coordinates.

Great! We know what the X’s mean (be patient for X_0 😉). Then what are the W’s linking them to the weird green circle? They are the **weights**: they represent how important each input is (you could phrase it as “how much *weight* this part of the input has in the decision-making process”). This is where the whole learning process happens; basically, we will tweak these until we are satisfied with the results they yield, but I am getting ahead of myself here, this is for later.

Now to the weird green circle. Allow me to call it the **processor** (I don’t know if there is an official name for it, but *weird green circle* is not going to cut it). The fact that there are two different symbols in there already hints at there being two different processes. The first part (sigma) refers to the *weighted sum of the inputs*, meaning the processor will calculate an internal value that is the sum of each input multiplied by its corresponding weight. This is represented by the following, not-yet-too-awful, formula:

sum = W_1·X_1 + W_2·X_2 + … + W_n·X_n

Before getting to the second part, this is where X_0 makes its great appearance. There is no better way to explain it to you than to show you its reason to be:

What would happen if all inputs were equal to 0? Exactly, the sum would be 0. But 0 is such a special case; we don’t want it to be forced on the Perceptron by the input, we want the weights to be able to change the inputs required to produce 0. This is why X_0 (and W_0!) are introduced: together they form what we call the **bias**, a value that is completely independent from the input and that the processor will add to the weighted sum (since X_0 is always 1, the bias is simply W_0).

The second part of the processor (phi) refers to the **formatting** part. As you can see from the formula above, the sum calculated by the processor can have any real value (depending on the input and the weights). But that may not be what we want. Take our problem, for example: we want to know whether a point is above or below a line, which means we expect only 2 different numbers in the output (we’ll use 1 and -1), not a whole range of numbers between minus infinity and infinity. That is why for our formatting we will use the *sign* function (giving -1 when a number is below 0 and 1 when a number is greater than or equal to 0). After this formatting process, you finally get… the **output**! (Yay, I guess.)

Alright, that’s enough theory. I haven’t talked at length about how the Perceptron is going to learn, but I feel like this is something that is easier to understand while writing the code, so bear with me. Anyway, you and I just want to get to building this thing, so let’s get coding.
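Before we do, the two processor steps described above can be sketched in a few lines of Python. The function names and the sample values here are placeholders of my own, not the article’s code:

```python
# Formatting step: the sign function, mapping any real number to -1 or 1.
def sign(value):
    return 1 if value >= 0 else -1

# Full processor: weighted sum of the inputs, plus the bias, then formatting.
def process(inputs, weights, bias):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sign(weighted_sum + bias)

# Example: 0.5*2.0 + (-0.5)*3.0 + 0.1 = -0.4, which the sign function maps to -1.
print(process([2.0, 3.0], [0.5, -0.5], 0.1))  # -1
```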

Finally getting down to the real thing. Going forward, I assume you have a Python file opened in your favorite IDE. We’ll start by creating the Perceptron class. In our case we will only need 2 inputs, but we will create the class with a variable number of inputs in case you want to toy around with the code later. In the constructor we will also initialize the bias and the weights (there is a lot of talk about how to initialize those, but here we are working on such a simple problem that initializing them randomly will do the trick).
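Following that description, a minimal sketch of the class and its constructor could look like this (the parameter name `amount_of_inputs` is my own choice):

```python
import random

class Perceptron:
    def __init__(self, amount_of_inputs):
        # One weight per input, plus the bias (W_0), all initialized
        # randomly between -1 and 1. Random initialization is fine here
        # because the problem is so simple.
        self.weights = [random.uniform(-1, 1) for _ in range(amount_of_inputs)]
        self.bias = random.uniform(-1, 1)
```

For our two-coordinate points, we would create it with `Perceptron(2)`.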

Credit: BecomingHuman, by Tom Gautot