
Is BERT Always the Better Cheaper Faster Answer in NLP? Apparently Not.

September 24, 2020
in Data Science

Summary: Since Google introduced BERT NLP models in 2018, they have become the go-to choice.  New evidence, however, shows that LSTM models may outperform BERT by a wide margin, meaning you may need to evaluate both approaches for your NLP project.

 


Over the last year or two, if you needed to deliver an NLP project quickly and with SOTA (state-of-the-art) performance, increasingly you reached for a pretrained BERT module as the starting point.

BERT (Bidirectional Encoder Representations from Transformers) has been heralded as the go-to replacement for LSTM models for several reasons:

  • It’s available as off-the-shelf modules, especially from the TensorFlow Hub library, that have been trained and tested on large open datasets. These then serve as the baseline in transfer learning, which fine-tunes your resulting NLP application.
  • Because they are based on the transformer element of DNNs and not the recurrent structure of RNNs/LSTMs, they lend themselves to parallelization, which can both speed up modeling and reduce cost.
  • Also, the transformer’s attention mechanism takes into account the context of every word in the sentence at the same time, whereas LSTMs must recur directionally (forward and backward) along the sequence of words to extract meaning.  This is believed to give BERT an advantage in accuracy.
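The contrast in that last point can be sketched in a few lines of plain Python. Scaled dot-product attention scores every word against every other word in a single step, with no left-to-right pass, which is also what makes transformer layers easy to parallelize. This is a toy illustration of the idea, not BERT’s actual implementation:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(embeddings):
    """For each word vector, score it against every word in the
    sentence at once (scaled dot product), then normalize the
    scores into attention weights.  Unlike an LSTM, no sequential
    left-to-right (or right-to-left) pass is required."""
    dim = len(embeddings[0])
    weights = []
    for q in embeddings:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in embeddings]
        weights.append(softmax(scores))
    return weights
```

Each row of weights sums to 1 and spans the whole sentence, so every word’s representation can draw on full-sentence context in one shot.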

Recently, however, there is growing evidence that BERT may not always give the best performance.

In their recently released arXiv paper, Victor Makarenkov and Lior Rokach of Ben-Gurion University share the results of a controlled experiment contrasting transfer-learning-based BERT models with from-scratch LSTM models.

Using a number of different pre-trained BERT modules from TensorFlow Hub, fine-tuned for the downstream task, both of their experiments resulted in the LSTM models outperforming BERT with transfer learning.

 

Experiment 1

The first test was a proper-word-choice test: for example, selecting between alternatives such as “my wife thinks I am a BEAUTIFUL / HANDSOME guy”.
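A word-choice task like this is typically framed as ranking: fill the blank with each candidate, score the resulting sentence with a language model, and order the candidates by score. A minimal sketch of that framing, with a deliberately trivial stand-in scorer in place of a trained model (the scorer and names here are hypothetical, not from the paper):

```python
def rank_candidates(sentence_template, candidates, score_fn):
    """Fill the blank with each candidate word, score the completed
    sentence, and return candidates from most to least likely.
    `score_fn` stands in for a trained language model."""
    scored = [(score_fn(sentence_template.format(word=w)), w)
              for w in candidates]
    return [w for _, w in sorted(scored, reverse=True)]

# Toy scorer for illustration only: prefers sentences with "handsome".
def toy_score(sentence):
    return 1.0 if "handsome" in sentence else 0.0
```

With `rank_candidates("my wife thinks I am a {word} guy", ["beautiful", "handsome"], toy_score)`, the model’s preferred word comes out first; the evaluation then asks how highly the correct word was ranked.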

They used a corpus of 30,000 academic articles to train the bidirectional LSTM model (their baseline) from scratch, and the same set to fine-tune two off-the-shelf pretrained BERT models, one general purpose and the other trained on material from the same domain.

Accuracy was evaluated with the Mean Reciprocal Rank (MRR) metric.  MRR is a typical metric for any process that produces a list of candidate answers ordered by probability of correctness, such as selecting the most appropriate word.
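Concretely, MRR takes the rank at which the correct answer appears in each query’s candidate list and averages the reciprocals of those ranks, so a correct answer in first place contributes 1.0, in second place 0.5, and so on. A minimal implementation:

```python
def mean_reciprocal_rank(rankings, answers):
    """rankings: one ranked candidate list per query (best first).
    answers: the correct item for each query.
    Returns the average of 1 / (rank of the correct answer)."""
    total = 0.0
    for ranked, answer in zip(rankings, answers):
        rank = ranked.index(answer) + 1  # ranks are 1-based
        total += 1.0 / rank
    return total / len(rankings)
```

For example, one query answered in first place and one in second place gives an MRR of (1.0 + 0.5) / 2 = 0.75.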

 

The articles contained domain-specific terms which the LSTM learned from scratch but which the fine-tuned BERTs failed to differentiate as well.  In fact, the BERT models fell well short of the LSTM models.

 

Experiment 2

In the second instance, a binary classification test, the models were tasked with differentiating the political perspective of test articles from the US and British press, determining whether they represented a fundamentally Israeli or Palestinian point of view.  The articles were manually annotated by multiple annotators who agreed on the labels applied.

The general-purpose BERT model had been trained on English Wikipedia and BooksCorpus.  The LSTM model was also off the shelf, a general-purpose news-domain model.

 

Once again, in this classification problem, the LSTM was clearly superior.

These results call into question whether you should reach first for a BERT transfer-learning approach for your next NLP project.  If performance differences as great as those shown above are important to your project’s success, you may need to evaluate both approaches.

 

 

Other articles by Bill Vorhies.

About the author:  Bill is Contributing Editor for Data Science Central.  Bill is also President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist since 2001.  His articles have been read more than 2.1 million times.

[email protected] or [email protected]


Credit: Data Science Central By: William Vorhies
