What Is Gradient Boosting? | Unite.AI

January 10, 2021
in Machine Learning

One type of machine learning model that has proven extremely useful in data science competitions is the gradient boosting model. Gradient boosting is essentially the process of converting weak learning models into strong learning models. Yet how exactly is this accomplished? Let’s take a closer look at gradient boosting algorithms to better understand how a gradient boosting model converts weak learners into strong learners.

Defining Gradient Boosting

This article aims to give you a good intuition for what gradient boosting is, without many breakdowns of the mathematics that underlie the algorithms. Once you have an appreciation for how gradient boosting operates at a high level, you are encouraged to go deeper and explore the math that makes it possible.

Let’s start by defining what it means to “boost” a learner. Weak learners are converted into strong learners by adjusting the properties of the learning model. So what learning algorithm, exactly, is being boosted?

Boosting models work by augmenting another common machine learning model, a decision tree.

A decision tree model functions by splitting a dataset down into smaller and smaller portions, and once the subsets can’t be split any further the result is a tree with nodes and leaves. Nodes in a decision tree are where decisions about data points are made using different filtering criteria. The leaves in a decision tree are the data points that have been classified. Decision tree algorithms can handle both numerical and categorical data, and splits in the tree are based on specific variables/features.
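
As a rough illustration of this base learner, here is a minimal decision tree sketch using scikit-learn (the library choice and the toy data are assumptions for illustration; the article does not reference a specific implementation):

```python
from sklearn.tree import DecisionTreeClassifier

# Toy dataset: two numerical features per observation (e.g. age, income).
X = [[25, 40000], [35, 60000], [45, 80000], [20, 20000]]
y = [0, 1, 1, 0]  # class labels

# Each internal node of the fitted tree tests one feature against a
# threshold; each leaf holds the classification for the points that reach it.
tree = DecisionTreeClassifier(max_depth=2)
tree.fit(X, y)

print(tree.predict([[30, 50000]]))  # follows the splits down to a leaf
```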

Illustration of the way boosting models are trained.
Photo: SeattleDataBuy via Wikimedia Commons, CC 4.0 (https://commons.wikimedia.org/wiki/File:Boosting.png)

One type of boosting algorithm is the AdaBoost algorithm. AdaBoost algorithms start by training a decision tree model and assigning an equal weight to every observation. After the first tree has been evaluated for accuracy, the weights for the different observations are adjusted. Observations that were easy to classify have their weights lowered, while observations that were difficult to classify have their weights increased. A second tree is created using these adjusted weights, with the aim that the second tree’s predictions will be more accurate than the first tree’s predictions.

The model now consists of the predictions for the original tree and the new tree (or Tree 1 + Tree 2). The classification accuracy is assessed once more based on the new model. A third tree is created based on the calculated error for the model and the weights are once more adjusted. This process continues for a given number of iterations. The final model is an ensemble model that uses the weighted sum of the predictions made by all the previously constructed trees.
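
The reweighting loop described above can be sketched in a few lines. The following is a simplified, from-scratch version of discrete AdaBoost (the function names and constants are illustrative, not from the article):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    """Minimal discrete AdaBoost sketch; y must be a NumPy array of -1/+1."""
    n = len(y)
    w = np.full(n, 1.0 / n)                    # equal weight for every observation
    trees, alphas = [], []
    for _ in range(n_rounds):
        tree = DecisionTreeClassifier(max_depth=1)
        tree.fit(X, y, sample_weight=w)        # train on the current weights
        pred = tree.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # this tree's vote in the ensemble
        # Raise weights on misclassified points, lower them on correct ones.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        trees.append(tree)
        alphas.append(alpha)
    return trees, alphas

def adaboost_predict(X, trees, alphas):
    # Final model: sign of the weighted sum of all the trees' predictions.
    return np.sign(sum(a * t.predict(X) for a, t in zip(alphas, trees)))
```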

The process described above uses decision trees as the base predictors/models, yet a boosting approach can be carried out with a wide range of models, like many standard classifier and regressor models. The key concepts to understand are that subsequent predictors learn from the mistakes made by previous ones and that the predictors are created sequentially.

The primary advantage of boosting algorithms is that they take less time to arrive at accurate predictions when compared to other machine learning models. Care needs to be taken when employing boosting algorithms, however, as they are prone to overfitting.

Gradient Boosting

We’ll now look at one of the most commonly used boosting algorithms. Gradient Boosting Models (GBMs) are known for their high accuracy, and they augment the general principles used in AdaBoost.

The primary difference between a Gradient Boosting Model and AdaBoost is that GBMs use a different method of calculating which learners are misidentifying data points. AdaBoost calculates where a model is underperforming by examining data points that are heavily weighted. Meanwhile, GBMs use gradients to determine the accuracy of learners, applying a loss function to the model. Loss functions are a way to measure the accuracy of a model’s fit on the dataset, calculating an error and optimizing the model to reduce that error. GBMs let the user optimize a specified loss function based on their desired goal.

Taking the most common loss function, Mean Squared Error (MSE), as an example, gradient descent is used to update the predictions based on a predefined learning rate, aiming to find the values where the loss is minimal.

To make it clearer, each new learner is trained on the current model’s residuals:

Residuals = actual target values - old, imperfect predictions.

In a more statistical sense, GBMs aim to find relevant patterns in a model’s residuals, adjusting the model to fit those patterns and bring the residuals as close to zero as possible. For a perfectly fit model, the residuals would be distributed around 0, so GBMs search for remaining patterns within the residuals and update the model around these patterns.

In other words, the predictions are updated so that the sum of all residuals is as close to 0 as possible, meaning that the predicted values will be very close to the actual values.

Note that a wide variety of other loss functions (such as logarithmic loss) can be used by a GBM; MSE was selected above for the sake of simplicity.
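
Under MSE loss the negative gradient is exactly the residual, so the training loop reduces to “fit each new tree to the residuals, then add a shrunken version of it to the running prediction.” A minimal sketch (the parameter values are illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_trees=100, learning_rate=0.1):
    """Gradient boosting for MSE loss; y must be a NumPy float array."""
    f0 = y.mean()                                 # initial prediction: the mean
    pred = np.full_like(y, f0)
    trees = []
    for _ in range(n_trees):
        residuals = y - pred                      # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=3)
        tree.fit(X, residuals)                    # fit the pattern in the residuals
        pred += learning_rate * tree.predict(X)   # shrunken, sequential update
        trees.append(tree)
    return f0, trees

def gbm_predict(X, f0, trees, learning_rate=0.1):
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```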

Variations On Gradient Boosting Models

Gradient Boosting Models are greedy algorithms that are prone to overfitting on a dataset. This can be guarded against with several different methods that improve the performance of a GBM. GBMs can be regularized with four different methods: Shrinkage, Tree Constraints, Random Sampling (Stochastic Gradient Boosting), and Penalized Learning.

Shrinkage

As previously mentioned, in GBMs predictions are summed together in a sequential fashion. In “shrinkage”, the contribution of every tree to the overall sum is scaled down. Weights are applied that slow down the algorithm’s rate of learning, necessitating that more trees be added to the model, which typically improves the model’s robustness and performance. The trade-off is that the model takes longer to train.
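
In scikit-learn’s implementation, for example, shrinkage is exposed as the learning_rate parameter (the values below are illustrative, not recommendations):

```python
from sklearn.ensemble import GradientBoostingRegressor

# A smaller learning_rate shrinks each tree's contribution, so more trees
# (n_estimators) are needed, but the ensemble usually generalizes better.
slow_and_robust = GradientBoostingRegressor(learning_rate=0.05, n_estimators=500)
fast_but_riskier = GradientBoostingRegressor(learning_rate=0.5, n_estimators=50)
```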

Tree Constraints

Constraining the tree with various tweaks, like limiting the depth of the tree or capping the number of nodes or leaves in the tree, will make it harder for the model to overfit. Imposing a constraint on the minimum number of observations per split has a similar effect. Once more, the trade-off is that it will take the model longer to train.
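
In scikit-learn, these constraints correspond to parameters such as the following (the values are illustrative):

```python
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(
    max_depth=3,           # limit how deep each tree may grow
    max_leaf_nodes=8,      # cap the number of leaves per tree
    min_samples_split=20,  # require this many observations before splitting
)
```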

Random Sampling

The individual learners can be created through a stochastic process, based on randomly selected subsamples of the training dataset. This has the effect of reducing correlations between trees, which guards against overfitting. The dataset can be subsampled before creating the trees or before considering a split in the tree.
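
In scikit-learn, row subsampling is controlled by the subsample parameter, and per-split feature subsampling by max_features (the values are illustrative):

```python
from sklearn.ensemble import GradientBoostingRegressor

# subsample < 1.0 trains each tree on a random fraction of the rows
# (stochastic gradient boosting); max_features randomly subsamples the
# columns considered at each split.
model = GradientBoostingRegressor(subsample=0.7, max_features=0.8)
```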

Penalized Learning

Beyond constraining the model through limiting the structure of the tree, it’s possible to use a regression tree. Regression trees have numerical values attached to each of their leaves; these function as weights, and they can be adjusted with common regularization functions like L1 and L2 regularization.
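
scikit-learn’s gradient boosting does not expose these leaf-weight penalties directly, but XGBoost does, assuming that library is available (the values are illustrative):

```python
import xgboost as xgb

# reg_alpha applies an L1 penalty and reg_lambda an L2 penalty to the
# numerical weights attached to each tree's leaves.
model = xgb.XGBRegressor(reg_alpha=0.1, reg_lambda=1.0)
```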

Credit: Google News
