Transfer learning in NLP using fastai – Becoming Human: Artificial Intelligence Magazine

April 10, 2019
in Neural Networks

Credit: BecomingHuman

…and why does it work better than n-grams?

Find the full Jupyter notebook here.

Traditional approaches

Previous approaches to sentiment analysis used something called n-grams. We would first convert our text into tokens (our vocabulary) and then use those tokens to represent the sentences in our text as a sparse matrix.

For example, if our vocabulary were ["the", "it", "was", "when", "which", "edit", "introduction", "best", "video", "time"] and we saw the sentence “it was the best time”, that sentence would be represented as [1,1,1,0,0,0,0,1,0,1]. This is done for every sentence in our data, and the final matrix is used for training. However, this approach misclassifies phrases like “not good”, which is why we also use combinations of adjacent words as tokens (“it was”, “was the”, and so on). Find the full Jupyter notebook here to learn more about this and similar approaches.
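The sparse representation above can be sketched in a few lines of plain Python. This is purely illustrative; fastai and scikit-learn ship real tokenizers, and the vocabulary here is the toy one from the example:

```python
# Bag-of-words sketch: mark each vocab word as present (1) or absent (0).
vocab = ["the", "it", "was", "when", "which", "edit",
         "introduction", "best", "video", "time"]

def bag_of_words(sentence, vocab):
    """One bit per vocab word: was it seen anywhere in the sentence?"""
    tokens = set(sentence.lower().split())
    return [1 if word in tokens else 0 for word in vocab]

print(bag_of_words("it was the best time", vocab))
# [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]

def bigrams(sentence):
    """Combinations of adjacent words, e.g. to keep 'not good' together."""
    tokens = sentence.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]

print(bigrams("it was the best time"))
# ['it was', 'was the', 'the best', 'best time']
```

Note how the bag-of-words vector throws away word order entirely, while the bigram tokens recover just enough of it to keep “not good” distinct from “good”.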

An important drawback

The drawback of representing text as tokens this way is that, essentially, the model cannot understand English. The structure of the sentence is not taken into consideration; only the frequency of the words is used. The model does not know the difference between “I want to eat a hot __” and “It was a hot __”. And because it cannot understand English, it cannot understand movie reviews and identify whether someone really liked a movie or not.

To overcome this drawback, we will once again resort to transfer learning. We will use a pre-trained model called a language model. What is it, and how do we train a model with it? Let’s find out.

The dataset

For this article, we will use Cornell’s movie review dataset v2.0, which has 1,000 positive and 1,000 negative reviews. Using this data, we create our own data bunch.

Creating a data bunch creates a separate token for every word. Most of the tokens are words, but some are punctuation: question marks, braces, apostrophes.

We now find all the unique tokens and calculate their frequencies. This big list of tokens is called the vocab. Here are the first ten in order of frequency:

We see a lot of odd tokens here starting with xx. Here’s the thing.

Every word in our vocab is going to require a separate row in a weight matrix in our neural net. To keep that weight matrix from getting too large, we restrict the vocab to no more than (by default) 60,000 words. And if a word doesn’t appear more than two times, we don’t put it in the vocab either.

In this way, we keep the vocab to a reasonable size. The xxunk tokens are unknown tokens: they simply mean the word was not common enough to make it into our vocab. There are some other special tokens too.
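These vocab-building rules can be sketched in plain Python. This is a toy illustration, not fastai’s actual implementation; the corpus is made up, and the threshold of 3 simply encodes “appears more than two times”:

```python
from collections import Counter

# Toy corpus standing in for the tokenized reviews; in the article this
# step happens when the data bunch is built.
corpus = ("the movie was good . the plot was thin but the acting "
          "was superb .").split()

MAX_VOCAB = 60000   # default cap on vocabulary size
MIN_FREQ = 3        # keep only words appearing more than two times

freq = Counter(corpus)

# Reserve index 0 for the unknown token, then keep the most frequent
# words that clear the frequency threshold, up to the size cap.
vocab = ["xxunk"] + [w for w, c in freq.most_common(MAX_VOCAB) if c >= MIN_FREQ]

print(freq.most_common(3))
# [('the', 3), ('was', 3), ('.', 2)]
print(vocab)
# ['xxunk', 'the', 'was']
```

In this tiny corpus only “the” and “was” survive the cut; everything else would be replaced by xxunk.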

For example, xxfld: if a document has separate parts (title, summary, abstract, body), each part gets its own numbered field (e.g. xxfld 2).

Let’s now look at one review and its corresponding representation as an array of token numbers.
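Turning tokens into an array of token numbers (numericalization) boils down to a dictionary lookup. A minimal sketch with a made-up vocab, where index 0 plays the role of xxunk:

```python
# Numericalizing a review: each token is replaced by its index in the vocab.
vocab = ["xxunk", "the", "movie", "was", "good", "."]
stoi = {word: i for i, word in enumerate(vocab)}   # string-to-int lookup

def numericalize(tokens):
    """Map each token to its vocab index; unknowns map to 0 (xxunk)."""
    return [stoi.get(t, 0) for t in tokens]

review = "the movie was surprisingly good .".split()
print(numericalize(review))
# [1, 2, 3, 0, 4, 5]
```

“surprisingly” is not in the vocab, so it becomes 0, the xxunk index; the neural net only ever sees these integer arrays.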

Now that we have our data in place, we can start thinking about modelling. Our approach will look like this:

As discussed earlier, instead of just using one bit for every word and then deciding whether a person liked a movie or not, we want our model to learn some English. For this we are going to require a bigger set of documents than just our review data.

Wikitext-103 is a subset of the largest articles from Wikipedia, with a little preprocessing applied, and is available for download. Jeremy from fastai used this dataset to build a language model.

Language model

A language model is a model that predicts the next word in a sentence. To predict the next word in a sentence, you need to know quite a lot about the English language; by this we mean being able to complete sentences like the following:

I want to eat a hot __. (dog), It was a hot __. (day)

So Jeremy built a neural net to predict the next word in every significantly sized Wikipedia article. That’s a lot of information, something like billions of tokens. With billions of words to predict, we make mistakes in those predictions, get gradients from them, update our weights, and get better and better until we are pretty good at predicting the next word in a Wikipedia article.
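The prediction task itself can be illustrated with a toy counting model. Instead of a neural net trained on billions of tokens, we just record which word most often follows each word in a tiny made-up corpus:

```python
from collections import Counter, defaultdict

# Tiny corpus echoing the "hot dog" / "hot day" example from the article.
corpus = ("i want to eat a hot dog . it was a hot day . "
          "i want to eat a hot dog .").split()

# Count, for every word, which words were seen immediately after it.
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

def predict_next(word):
    """Most frequent continuation of `word` seen in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("hot"))   # 'dog' (seen twice, vs 'day' once)
print(predict_next("want"))  # 'to'
```

A real language model does the same job with a neural net and a context far longer than one word, which is what lets it pick “dog” after “I want to eat a hot” but “day” after “It was a hot”.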

Why is this useful?

Because at that point we have a model that knows how to complete sentences like these. So it knows quite a lot about English and a lot about how the world works; it can even tell who the president was in different years.

On top of this language model, we show it our own data (without the labels).

This way we train it from being good at predicting words in Wikipedia articles to being good at predicting words in movie reviews (our specific case).

After training for a few epochs, we get an accuracy of ~30%, which is pretty good for a language model.

Let’s try to make some predictions using this language model.

And one more!

Notice that the output does not make much sense, nor is it grammatically correct, but it sounds vaguely like English. We can now save the encoder part of this model (the part that understands English) and use it as a pretrained model.

We use this encoder and the data bunch we created earlier (with the labels) to train our classifier, and we manage to achieve 92% accuracy. Let’s take a look at some of the predictions.

With a larger corpus, we could train it even better to understand all kinds of reviews. But this, right here, is still really good.

If you found this article useful, give it at least 50 claps.

~happy learning.


Credit: BecomingHuman By: Dipam Vasani
