Gradient Descent is one of the most popular optimization algorithms in use today, and Momentum is its greatest companion; Momentum opened up a whole new perspective for research in optimization. For months I tried to understand backpropagation, and when I finally learned it, it felt totally overwhelming.

This article is a tribute to the Mother of Artificial Intelligence: Gradient Descent.

Here’s a more mathematical explanation of Gradient Descent and its variants → https://colab.research.google.com/drive/1lNhdf4TwPvQrN3CKyGhPmtxC9uOGGKZW#scrollTo=N9jb8SnWyDx1&forceEdit=true&offline=true&sandboxMode=true

### The Birth of the Mother

The first question that comes to most of our minds is:

Why do we need optimization algorithms? Do they really have uses in economics and mathematics?

This is where Mathematical Optimization comes in. Wikipedia says:

In the simplest case, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function.

In simpler words, a mathematical function is optimized in order to find the inputs that produce its minimum or maximum value. Suppose we have the function *f(x) = x²*.

The minimum of this function, the point on the curve with the smallest value, is *(0, 0)*: if *x = 0*, then *y = 0*, and *x²* can never be smaller than 0.
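As a quick sketch of this idea (the function name `f` is mine, not from the article), we can evaluate *f(x) = x²* at a few points around *x = 0* and see that the value is smallest there:

```python
# f(x) = x^2, the example function from the text
def f(x):
    return x ** 2

# Evaluate f at a few points around the minimum at x = 0
for x in [-2, -1, 0, 1, 2]:
    print(x, f(x))  # the smallest value, 0, occurs at x = 0
```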

#### Gradient Descent

Initially, the Random optimization algorithm was used for optimization. It consisted of randomly picking a set of input values, plugging them into the function, and recording the results. The smallest result found was taken as the minimum.
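A minimal sketch of this random approach might look like the following (the search range, sample count, and seed are my own illustrative choices):

```python
import random

# The function to minimize, f(x) = x^2
def f(x):
    return x ** 2

random.seed(0)  # fixed seed so the run is reproducible

# Sample candidate inputs uniformly at random and keep
# the one that produces the smallest function value.
best_x = min((random.uniform(-10, 10) for _ in range(10_000)), key=f)
print(best_x, f(best_x))  # best_x lands close to the true minimum at 0
```

Note that this brute-force search needs many samples to get close to the minimum, and scales badly as the number of input variables grows, which is exactly the inefficiency Gradient Descent addresses.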

Gradient Descent proposed a newer, more efficient way to reach the minimum or the maximum (the opposite of the minimum) of a function. It makes use of the *gradient*, or *slope*, of the function being optimized.

### What is the gradient of a function? (No calculus guaranteed!)

In simpler words, the gradient is the slope of the tangent line to a curve at a specific point. Take our function *f(x) = x²*: its gradient can be calculated by taking the *derivative* of the function, which here is *f′(x) = 2x*.

For a multivariable function, the gradient is a vector of all the partial derivatives of a function.
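Putting these pieces together, here is a minimal sketch of Gradient Descent on *f(x) = x²*, using its derivative *2x* (the starting point, learning rate, and step count are my own illustrative choices, not values from the article):

```python
# Derivative (gradient) of f(x) = x^2
def grad(x):
    return 2 * x

x = 5.0    # arbitrary starting point
lr = 0.1   # learning rate: how big a step to take each iteration

# Repeatedly step in the direction opposite the gradient,
# i.e. downhill toward the minimum.
for _ in range(100):
    x -= lr * grad(x)

print(x)  # x has converged very close to the minimum at 0
```

Each update moves *x* against the slope, so steps are large far from the minimum and shrink as the slope flattens out near it; for a multivariable function the same update is applied to every variable using its partial derivative.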

Credit: BecomingHuman By: Shubham Panchal