Ladies and gentlemen, fasten your seatbelts, lean back and take a deep breath, for we are going to go on a bumpy ride!

Now, before you shoo me away for corny intros, let us dive right into the magical world of data science.

First, do not be afraid: we are not going to wade through algorithms full of mathematical formulas that whoosh right over your head. Instead, as the title suggests, we will use the scikit-learn library, which lets us call the required classes and get our results.

Easy, peasy.

But that doesn’t mean you do not need any knowledge of how these algorithms work from the inside. At one point or another, you do need to learn them, for you cannot avoid them forever. But we will discuss them some other day, so let’s focus on the task at hand here.

### Implementation

First, we import a few libraries:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
```

Assuming you already know NumPy and pandas, I am moving on to Matplotlib, which is a plotting library for Python. Basically, this is the dude you want to call when you want to make graphs and charts.

The next step is to import our dataset ‘sample.csv’ (https://www.kaggle.com/rohankayan/years-of-experience-and-salary-dataset) and then split it into the input (independent) variable and the output (dependent) variable.

```python
dataset = pd.read_csv('sample.csv')
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 1].values
```
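If the `.iloc` slicing looks cryptic, here is a minimal sketch of what it produces, using a tiny made-up DataFrame in place of the CSV (the column names are my assumption based on the Kaggle dataset, not taken from the article):

```python
import pandas as pd

# Tiny synthetic stand-in for sample.csv; the real file has 30 rows.
dataset = pd.DataFrame({
    "YearsExperience": [1.1, 2.0, 3.2],
    "Salary": [39343.0, 43525.0, 54445.0],
})

# .iloc[:, :-1] keeps every column except the last, giving a 2-D
# array of shape (n, 1) -- the feature shape scikit-learn expects.
x = dataset.iloc[:, :-1].values

# .iloc[:, 1] grabs the second column as a 1-D target array of shape (n,).
y = dataset.iloc[:, 1].values

print(x.shape)  # (3, 1)
print(y.shape)  # (3,)
```

The distinction matters: scikit-learn wants the features as a 2-D array even when there is only one feature, which is why `:-1` (a slice) is used for `x` while a plain column index is used for `y`.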

Real datasets usually run into thousands of rows, but since the one I have taken here is a sample, it has just 30 rows. So when we split our data into a training set and a testing set, we set the test size to 1/3, i.e., 20 rows go into the training set and the remaining 10 make it to the testing set.

```python
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3)
```
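To see the 20/10 split concretely, here is a quick sketch on synthetic data (the arrays and the `random_state=0` are my additions; `random_state` just pins the shuffle so the split is reproducible across runs):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 30 rows, like the sample dataset.
x = np.arange(30, dtype=float).reshape(-1, 1)
y = 2.0 * x.ravel() + 5.0

# test_size=1/3 sends 10 of the 30 rows to the test set;
# random_state makes the shuffle deterministic.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=1/3, random_state=0
)

print(len(x_train), len(x_test))  # 20 10
```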

Now, we will import the LinearRegression class and create an object of that class, which is our linear regression model.

```python
from sklearn.linear_model import LinearRegression

lr = LinearRegression()
```

Then we will use the fit method to “fit” the model to our dataset. What this does is simply make the regressor “study” our data and “learn” from it.

```python
lr.fit(x_train, y_train)
```
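What “learning” means here is concrete: the model estimates a slope and an intercept, stored in `coef_` and `intercept_` after fitting. A minimal sketch with made-up, perfectly linear data (the numbers are illustrative, not from the sample dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Perfectly linear synthetic data: salary = 9000 * years + 25000.
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = 9000.0 * x_train.ravel() + 25000.0

lr = LinearRegression()
lr.fit(x_train, y_train)

# The learned line lives in coef_ (slope) and intercept_.
print(round(lr.coef_[0]))    # 9000
print(round(lr.intercept_))  # 25000
```

Because the synthetic data is noise-free, the fit recovers the exact slope and intercept; on real data the fitted line minimizes the sum of squared errors instead of passing through every point.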

Now that we have created our model and trained it, it is time we test the model with our testing dataset.

```python
y_pred = lr.predict(x_test)
```
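Eyeballing `y_pred` against `y_test` only gets you so far; scikit-learn also ships metrics for scoring the predictions. A hedged sketch, again on synthetic noise-free data (so the scores come out perfect, which real data will not):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

# Noise-free synthetic data: the first 4 rows train, the last 2 test.
x = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = 9000.0 * x.ravel() + 25000.0

lr = LinearRegression().fit(x[:4], y[:4])
y_pred = lr.predict(x[4:])

# r2_score: 1.0 is a perfect fit; MAE is in the target's own units.
print(r2_score(y[4:], y_pred))
print(mean_absolute_error(y[4:], y_pred))
```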

And voila! You have successfully created a working linear regression model. Pat yourself on the back and revel in your success!

### Visualization

Wait, wait.

Do not start partying just yet, for we still have to visualize our data and create some charts.

First, we make use of a scatter plot to plot the actual observations, with x_train on the x-axis and y_train on the y-axis.

For the regression line, we will use x_train on the x-axis and then the predictions of the x_train observations on the y-axis.

We add a touch of aesthetics by coloring the original observations in red and the regression line in green.

```python
plt.scatter(x_train, y_train, color="red")
plt.plot(x_train, lr.predict(x_train), color="green")
plt.title("Salary vs Experience (Training set)")
plt.xlabel("Years of Experience")
plt.ylabel("Salary")
plt.show()
```
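The chart above shows only the training set. A test-set version might look like the sketch below: the scatter switches to the test points, but the green line stays exactly the one learned from `x_train`, since that is the single line the model committed to. (The arrays here are synthetic stand-ins, and the headless `Agg` backend is my addition so the snippet runs without a display.)

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the train/test arrays from the article.
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = 9000.0 * x_train.ravel() + 25000.0
x_test = np.array([[5.0], [6.0]])
y_test = 9000.0 * x_test.ravel() + 25000.0

lr = LinearRegression().fit(x_train, y_train)

# Test observations in red; the regression line is unchanged,
# so we still plot lr.predict(x_train) against x_train.
plt.scatter(x_test, y_test, color="red")
plt.plot(x_train, lr.predict(x_train), color="green")
plt.title("Salary vs Experience (Test set)")
plt.xlabel("Years of Experience")
plt.ylabel("Salary")
plt.show()
```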

Credit: BecomingHuman By: Sthitaprajna Mishra