## Gradient descent is the most common method for optimization.

This is the second part of the Optimization series. In this blog post, we'll continue with the Rod Balancing problem and try to solve it using the Gradient Descent method. In Part 1 we looked at what optimization is and tried to solve the same problem using **Exhaustive Search** *(a gradient-free method)*. If you haven't read that, here is the link.

**Gradient-based algorithms** are usually much faster than gradient-free methods. The whole objective here is to improve over time, i.e., to make each next step result in a better solution than the previous one. The **Gradient Descent algorithm** is one of the most well-known gradient-based algorithms. The decision variables used here are continuous ones, since continuity gives accurate gradients (slopes) at any given point on the curve.

So, to solve our problem with gradient descent, we'll reframe our objective function as a **minimization** problem. To do this we'll make an **assumption** and define our **cost function** *(also sometimes known as a **loss function** or **error function**)*. We will assume that the best solution is one that can balance the rod for at least 10 seconds *(let's state this assumption as '**y**')*. At its core, the cost function returns the difference between the actual output and the desired output. For our problem, the cost function becomes:

*C(x) = (f(x) − y)²*

Note: we squared the difference to avoid negative values; alternatively, you can just take the absolute value. Either will work.
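To make this concrete, here is a minimal sketch of the cost function in Python. The names `cost` and `Y_DESIRED` are my own for illustration; `f_of_x` stands for the measured balance time *f(x)* from a test:

```python
Y_DESIRED = 10.0  # desired balance time in seconds (our assumption 'y')

def cost(f_of_x: float, y: float = Y_DESIRED) -> float:
    """Squared difference between actual and desired balance time."""
    return (f_of_x - y) ** 2
```

Squaring keeps the cost non-negative, so balancing for 7 seconds and for 13 seconds are penalized equally.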

Now for every test result **f(x)**, i.e., the time in seconds the rod stayed on the finger, we can calculate our cost **C(x)**. So our objective function changes from maximizing *f(x)* to minimizing *C(x)*; we can state the modified objective function as:

*minimize C(x) = (f(x) − y)²*

Since the objective function has changed, our curve is also inverted: on the y-axis we now plot cost instead of time, and we try to minimize it.

Any gradient-descent-based algorithm follows a 3-step procedure:

1. **Search direction**
2. **Step size**
3. **Convergence check**
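The three steps above can be sketched as a simple loop. This is an illustrative sketch, not code from the post; `grad_C` is assumed to compute the slope *dC/dx*, `alpha` is the step size, and the loop stops when the update becomes negligible (the convergence check):

```python
def gradient_descent(grad_C, x0, alpha=0.1, tol=1e-6, max_iters=1000):
    """Minimize a 1-D cost function given its gradient grad_C."""
    x = x0
    for _ in range(max_iters):
        direction = -grad_C(x)    # 1. search direction (downhill)
        step = alpha * direction  # 2. step size scales the move
        x = x + step
        if abs(step) < tol:       # 3. convergence check
            break
    return x

# Example on a toy cost C(x) = (x - 3)**2, whose gradient is 2*(x - 3):
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With this toy cost, the loop converges to x ≈ 3, the point of zero slope.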

Once we know the error, we have to find the direction in which we should move our finger on the rod for a better solution. The direction is decided by taking the **derivative** of the cost function with respect to the decision variable(s). This simply means calculating the **slope** *(**dC/dx**)* on the curve for a specific value of the decision variable; this slope is known as the **gradient**. The greater the slope, the farther we are from the **minimum** *(i.e., the lowest point on the curve)*.
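Since our rod-balancing cost comes from physical tests rather than a formula, the slope *dC/dx* can be estimated numerically. A minimal sketch using a central difference (the function name and the step `h` are assumptions for illustration):

```python
def numerical_gradient(C, x, h=1e-5):
    """Central-difference estimate of the slope dC/dx at x."""
    return (C(x + h) - C(x - h)) / (2 * h)

# For C(x) = (x - 3)**2 the true slope at x = 5 is 2 * (5 - 3) = 4:
slope = numerical_gradient(lambda x: (x - 3) ** 2, 5.0)
```

In practice each call to `C` here means one physical test of the rod, so the two evaluations per gradient estimate are the price of not knowing a closed-form derivative.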