- Supervised/Unsupervised: Supervised
- Regression/Classification: Regression
The goal is to fit a line to the data values as well as possible. Meaning, we calculate the “line” that produces the least total error.
Formula — (Linear Regression):
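Using the notation defined in the bullets below, the model with p−1 predictors is typically written as:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{p-1} x_{p-1} + \epsilon
```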
- X variables (inputs): These are the input values that are in the dataset. If you’re predicting employee salaries, some inputs could be age, level of education, or location.
- β0 (bias term): A bias term is needed unless we believe the model should pass through the origin. The bias term is where the line intercepts the y-axis, and that y-intercept matters. For example, if we were predicting a baby’s weight, the line should not start at 0 lbs, since a baby cannot weigh nothing (assuming it was born).
- β1 … βp−1 (coefficients): We multiply the X variables with the weights/betas (represented by β). The betas are what the model is calculating. As you’ll later see, they are how we manipulate the line to fit the dependent variable.
- The epsilon (ϵ): Represents the error term, i.e., our inability to explain the data with 100% accuracy. We can never expect our sample to be a perfect representation of the population.
The objective is to estimate the betas (weights) that “fit” the data values best.
They are the coefficients multiplied with the inputs. If there were only one predictor, the goal would be to find the straight line that best “fits” the data points.
- An evaluation method must be established: a cost function. The cost function is typically the Mean Squared Error (MSE).
Formula — MSE (Mean Squared Error):
The MSE formula measures the average squared difference between the observed values and the predictions.
- yᵢ: The ground truth (the observed value for observation i).
- W*X: Multiplying our weights by the X variables to get a prediction.
- W0: Not shown in the function, but we will use a bias term.
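Putting those terms together (with the 1/(2N) convention explained in the notes that follow), the cost function can be sketched as:

```latex
\mathrm{MSE}(W) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - (W \cdot X_i + W_0) \right)^2
```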
So the process is relatively straightforward. Let’s use an example:
- You’re tasked with predicting people’s salaries. Age is your independent variable. In practice you would have more variables than age, but let’s stick with one. You will use the training set to measure accuracy.
- If you were only to fit one person, whose salary is $100,000 and who is 50 years old, the coefficient (beta) would be 2000. Meaning, for every additional year of age, a person is expected to earn $2000 more.
- Now there are two people. The second person earns $60,000 but is only 20 years old. Hence, our weight of 2000 does not work anymore.
Our job is to find a weight that minimizes the average error across all the observations.
The task becomes much more complicated when there are many variables and observations.
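To make the “minimize the average error” idea concrete, here is a minimal sketch of the two-person example (assuming no bias term and the 1/(2N) convention described in the MSE notes):

```python
# Toy data from the example: (age, salary) pairs.
ages = [50, 20]
salaries = [100_000, 60_000]

def mse(weight, xs, ys):
    """Mean squared error with the 1/(2N) convention and no bias term."""
    n = len(xs)
    return sum((y - weight * x) ** 2 for x, y in zip(xs, ys)) / (2 * n)

# The weight 2000 fits person one perfectly but misses person two,
# so a compromise weight gives a lower average error.
# (For one predictor and no bias, the minimizer is sum(x*y) / sum(x*x),
# which works out to roughly 2138 here.)
for w in (2000, 2200):
    print(w, mse(w, ages, salaries))
```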
Additional Notes (MSE):
- We square the difference so that every error contributes a positive value. The capital sigma, Σ, means we sum the squared differences between the predictions and the ground truths. Lastly, we divide by the total number of observations (N) to get the average.
- The reason we use “2N” instead of N is to simplify the derivatives: the 2 that comes from differentiating the square cancels out.
Our goal is to minimize the Cost Function (MSE).
“So how do we minimize the MSE?”
With gradient descent.
Formula — Gradient Descent (Simple Regression):
The diagram below illustrates the goal of gradient descent: we need to “reach” the bottom of the parabola. The slopes are drawn as the straight colored lines (green, yellow, and red). The steeper the slope, the quicker we descend, so the objective is to follow the slope that takes us from the current position to the base as efficiently as possible.
For a simple linear regression problem, we calculate the derivatives with respect to our parameters (including the bias). The equations below are the derivatives for m and b. Note that m is the same as the beta/weight in simple linear regression (the terminology is often inconsistent).
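A reconstruction of those two derivatives, assuming the 1/(2N) cost so the 2 from the square cancels:

```latex
\frac{\partial J}{\partial m} = -\frac{1}{N} \sum_{i=1}^{N} x_i \left( y_i - (m x_i + b) \right)
\qquad
\frac{\partial J}{\partial b} = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i - (m x_i + b) \right)
```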
- We calculate the derivatives of the bias and weights for each observation, then sum those derivatives and divide to get the average.
- The last step is updating the betas. These derivatives calculate how we should tweak the betas.
- The formula is original_beta minus (learning_rate * derivative).
- The derivative gives the direction to move in, and the learning rate controls how fast we move. A learning rate of 0.05 is a common starting point but can be adjusted.
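The steps above can be sketched as a small gradient-descent loop (the toy data and hyperparameters here are illustrative assumptions):

```python
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y ≈ m*x + b by gradient descent on the 1/(2N) MSE."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Average derivatives of the cost with respect to m and b.
        dm = -sum(x * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
        db = -sum(y - (m * x + b) for x, y in zip(xs, ys)) / n
        # Update step: original_beta minus (learning_rate * derivative).
        m -= lr * dm
        b -= lr * db
    return m, b

# Tiny illustrative dataset where y = 2x exactly, so m should
# approach 2 and b should approach 0.
m, b = gradient_descent([1, 2, 3], [2, 4, 6])
```

Note that with real-scale data (e.g. raw ages and salaries) you would normally scale the features first, or a 0.05 learning rate can diverge.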
Formula — Gradient Descent (Multi-Linear Regression):
More realistically, you’ll be dealing with a multi-linear regression problem. Hence, the equations below describe the partial derivatives for each of the possible weights, which are associated with each independent variable. The “hθ(x)” term is our prediction.
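A reconstruction of that partial derivative for the j-th weight, with hθ(x) as the prediction:

```latex
\frac{\partial J}{\partial \theta_j} = \frac{1}{N} \sum_{i=1}^{N} \left( h_\theta(x_i) - y_i \right) x_{i,j}
```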
If you’re rock climbing, you can move forward, backward, or up and down. The partial derivatives describe how much you should move along each direction in 3-dimensional space.
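The same update extends to many weights at once. This is a sketch using NumPy (an assumption, not from the source), with the bias folded in as an extra column of ones so every parameter is updated by its own partial derivative:

```python
import numpy as np

def fit(X, y, lr=0.05, epochs=2000):
    """Multi-linear regression via gradient descent on the 1/(2N) MSE."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend a bias column
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        preds = X @ w                  # h_theta(x): predictions for all rows
        grad = X.T @ (preds - y) / n   # partial derivative for each weight
        w -= lr * grad
    return w

# Two predictors with the made-up relation y = 1 + 2*x1 + 3*x2,
# so the fitted weights should approach (1, 2, 3).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]
w = fit(X, y)
```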
Is there anything else?
Yes. One of the difficulties in machine learning is a concept called overfitting: the model performs well on the training set (the dataset it learns from) but not as well on the validation or testing set (the datasets it is validated or tested with).
One technique to reduce overfitting is shrinking your weights/betas.
It’s a fascinating concept. The strongest beta/weight will still be the strongest relative to the other independent variables.
However, the magnitude will be reduced for all independent variables.
For this, we have ridge and lasso regression.
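As a sketch of what ridge regression does to the weights (the closed-form solution and the toy data here are my own illustration, not from the source):

```python
import numpy as np

def ridge(X, y, alpha):
    """Closed-form ridge solution (X^T X + alpha*I)^-1 X^T y.

    alpha = 0 reduces to ordinary least squares.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

# Made-up data where the true weights are (5, 2, 0.5).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + 0.5 * X[:, 2]

w_ols = ridge(X, y, alpha=0.0)    # recovers roughly (5, 2, 0.5)
w_ridge = ridge(X, y, alpha=50.0)

# The strongest weight stays the strongest, but its magnitude
# is pulled toward zero compared with plain least squares.
print("OLS:  ", w_ols)
print("Ridge:", w_ridge)
```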