
Designing Decision Trees From Scratch on Android

December 20, 2019
in Neural Networks

Decision Trees are one of the most loved 😘 classification algorithms in the world of Machine Learning, and they are used for both regression and classification. The most fundamental idea behind a decision tree is to find a root node which divides our dataset into more homogeneous subsets, and to repeat this process until we are left with samples belonging to a single class ( 100% homogeneity ).

With Python packages like scikit-learn, decision trees are easy to build and run in a couple of lines of code:


from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
import pandas as pd

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.DataFrame(iris.target, columns=["Species"])
tree = DecisionTreeClassifier(max_depth=2)
tree.fit(X, y)

This makes our life easy, at least when working in Python 😎. But decision trees have a number of practical applications, and so they sometimes need to be implemented on other platforms, like Android! In this story, we’ll be implementing a decision tree ( for classification ) in Kotlin ( or Java ).

The Python implementation of this decision tree is adapted from the stories below by Rakend Dubba, which I highly recommend:

The sample data is also inherited from the stories above.

The code is available on GitHub, along with a sample APK file: https://github.com/shubham0204/Decision_Tree_Android

First, as we all know, Kotlin doesn’t have a nice package like Pandas 🐼 to hold our dataset. We can create a simple DataFrame class which holds our data internally as a HashMap<String,ArrayList<String>>, where the key is the name of a column and the ArrayList<String> holds the data for each sample in the form of a String.

For simplicity, we keep all our features as well as labels in the form of Strings.

Later on, we’ll parse data from this class while building our tree.

Snippet 1-Defining the DataFrame class.
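If the embedded snippet doesn’t render in this copy, a minimal Java sketch of the DataFrame described above could look like this ( the method names addEntry and getColumn are hypothetical, not necessarily those in the repository ):

```java
import java.util.ArrayList;
import java.util.HashMap;

// A minimal sketch of the DataFrame described above: columns are keyed
// by name, and every value is stored as a String.
public class DataFrame {

    private final HashMap<String, ArrayList<String>> columns = new HashMap<>();

    // Append one value to the named column, creating the column if needed.
    public void addEntry(String columnName, String value) {
        columns.computeIfAbsent(columnName, k -> new ArrayList<>()).add(value);
    }

    // Fetch a whole column by name.
    public ArrayList<String> getColumn(String columnName) {
        return columns.get(columnName);
    }
}
```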

Next, we build some internal helper methods. These facilitate calculations that are not available in Kotlin’s standard library. The first one, argmax(), returns the index of the greatest value present in an array. The second one, getFreq(), returns the count or frequency of each element present in an array. The third one, logbase2(), returns the base-2 logarithm of a given number.

Snippet 2- Defining secondary methods.
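A sketch of those three helpers in Java, under the assumption that they behave exactly as described above:

```java
import java.util.HashMap;
import java.util.Map;

public class MathUtils {

    // Index of the greatest value in the array ( first occurrence on ties ).
    public static int argmax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++)
            if (values[i] > values[best]) best = i;
        return best;
    }

    // Frequency of each distinct element in the array.
    public static Map<String, Integer> getFreq(String[] items) {
        Map<String, Integer> freq = new HashMap<>();
        for (String item : items) freq.merge(item, 1, Integer::sum);
        return freq;
    }

    // Logarithm with base 2.
    public static double logbase2(double x) {
        return Math.log(x) / Math.log(2.0);
    }
}
```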

Now, in order to get the total entropy at the root node ( the labels ), we use the method below.

Imagine a tree with Entropies and hanging IG scores!

We use Shannon’s Entropy,

The expression for Shannon’s Entropy.
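The equation image did not survive in this copy; in its standard form, Shannon’s Entropy for a label column with k classes, where p_i is the fraction of samples in class i, is:

```latex
E_{label} = -\sum_{i=1}^{k} p_i \, \log_2 p_i
```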

The above entropy will be used for the calculation of the Information Gain,

The calculation of Information Gain ( IG ). Here E_label is the entropy of the root node ( above equation ) and E_feature is the entropy for a feature in our dataset.
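Again the equation image is missing here; the standard form of the IG score, with E_feature being the weighted average of the entropies of the subsets S_v produced by splitting the dataset S on each value v of the feature, is:

```latex
IG = E_{label} - E_{feature}, \qquad
E_{feature} = \sum_{v \,\in\, values} \frac{|S_v|}{|S|} \, E(S_v)
```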

The Kotlin method for calculating the root entropy E_label,

Snippet 3 — Defining a method for calculation of root entropy. LABEL_COLUMN_NAME is a String carrying the name of the column which carries the labels.
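In case the gist isn’t visible here, a hypothetical Java equivalent of this root-entropy calculation ( the class and method names are mine, not the repository’s ):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Entropy of the label column, E_label.
public class LabelEntropy {

    public static double entropy(List<String> labels) {
        // Count occurrences of each class.
        Map<String, Integer> freq = new HashMap<>();
        for (String label : labels) freq.merge(label, 1, Integer::sum);
        // E = -sum( p * log2( p ) )
        double entropy = 0.0;
        for (int count : freq.values()) {
            double p = (double) count / labels.size();
            entropy -= p * Math.log(p) / Math.log(2.0);
        }
        return entropy;
    }
}
```

A perfectly balanced binary label column gives an entropy of 1.0, and a homogeneous one gives 0.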

Next, we need to calculate E_feature for a specified feature. We feed a featureName and the data to the method, and it returns the entropy for that feature.

Snippet 4 — Calculation of entropy for a feature.
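A hypothetical Java version of the same calculation, assuming two parallel columns ( the feature values and the labels ) as input:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Weighted entropy of one feature column, E_feature.
public class FeatureEntropy {

    public static double featureEntropy(List<String> featureValues, List<String> labels) {
        double total = 0.0;
        // Distinct, sorted feature values ( no repetition! ).
        for (String value : new TreeSet<>(featureValues)) {
            // Label frequencies among rows holding this feature value.
            Map<String, Integer> labelFreq = new HashMap<>();
            int subsetSize = 0;
            for (int i = 0; i < featureValues.size(); i++) {
                if (featureValues.get(i).equals(value)) {
                    labelFreq.merge(labels.get(i), 1, Integer::sum);
                    subsetSize++;
                }
            }
            // Entropy of this subset.
            double entropy = 0.0;
            for (int count : labelFreq.values()) {
                double p = (double) count / subsetSize;
                entropy -= p * Math.log(p) / Math.log(2.0);
            }
            // Weight by the subset's share of the whole dataset.
            total += ((double) subsetSize / featureValues.size()) * entropy;
        }
        return total;
    }
}
```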

I’ll explain the above method to give an intuition of what’s going on.

labels carries the labels for all samples. featureValues gets the whole column named featureColumnName from the dataset. We loop through each of the distinct featureValues ( sorted, with no repetition of values! ). Inside the loop we define a variable named entropy, which is calculated by,

Entropy for a feature.
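The equation image is missing from this copy; given the description that follows, the entropy for one feature value v is presumably of the standard form, with numCount_c being the number of samples carrying both that feature value and class c:

```latex
entropy_v = -\sum_{c \,\in\, classes} \frac{numCount_c}{|S_v|} \, \log_2 \frac{numCount_c}{|S_v|}
```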

The numCount variable refers to the number of samples which have that featureValue and the corresponding label.

Each of these entropies is weighted and added to another variable, featureEntropy,

Snippet 5 — Finding the entropy of each feature.

Next, we need to find the feature which gives us the highest value of Information Gain. The method is given a data object, for which it uses labelEntropy and featureEntropy to calculate the IG score of each feature and store the scores in an array. We use our argmax function to get the index of the greatest IG score, and we finally return the corresponding feature name.

Snippet 6 — Returning a sub-table.

The above code snippet returns a HashMap which contains a sub-table filtered by featureName and featureValue .
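If the gist isn’t visible, a hypothetical Java sketch of that sub-table extraction: keep only the rows where the given feature column equals the given value, preserving every column.

```java
import java.util.ArrayList;
import java.util.HashMap;

public class SubTable {

    public static HashMap<String, ArrayList<String>> subTable(
            HashMap<String, ArrayList<String>> data,
            String featureName, String featureValue) {
        // Start with an empty copy of every column.
        HashMap<String, ArrayList<String>> result = new HashMap<>();
        for (String column : data.keySet()) result.put(column, new ArrayList<>());
        // Copy only the matching rows.
        for (int i = 0; i < data.get(featureName).size(); i++) {
            if (data.get(featureName).get(i).equals(featureValue)) {
                for (String column : data.keySet())
                    result.get(column).add(data.get(column).get(i));
            }
        }
        return result;
    }
}
```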

Constructing the Tree Recursively.

This is the most important part of our algorithm. We define a method createTree which will be called recursively until we are left with homogeneous datasets, i.e. with an entropy of 0.

Snippet 7- Constructing the tree.

Let us understand the above method.

  1. When the method is called initially, inputTree is null and data contains the whole dataset ( no subsets ).
  2. We find the highestIGFeatureName and fetch distinct attributes of it. Since our inputTree is null, we create a branch in our tree at line no. 8.
  3. The attributes which we fetched are now used. We iterate through each of them and get a sub-table for that attribute. Following this, we create branches in our tree p on lines 21 and 22. Since we aren’t yet left with homogeneous datasets, the method is called recursively at line 22.
  4. After a number of recursions, the length of the counts array becomes 1. This denotes that the samples present in subTable[ LABEL_COLUMN_NAME ] are all of the same kind, i.e. their entropy is 0. Here, we break the recursion at line 18 and assign p[ attribute ] the value of clValue[0].

The above method will return an object of type HashMap<String,Any> which represents our tree.
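Since the embedded gist may not render here, below is a self-contained, hypothetical Java sketch of the whole recursive construction. The names LABEL, subTable, and highestIGFeature are stand-ins for the helpers built earlier; leaves are stored as Strings, branches as nested HashMaps, mirroring the HashMap<String,Any> structure described above.

```java
import java.util.*;

public class TreeBuilder {

    static final String LABEL = "label";  // assumed label column name

    // Shannon entropy of a list of class labels.
    static double entropy(List<String> values) {
        Map<String, Integer> freq = new HashMap<>();
        for (String v : values) freq.merge(v, 1, Integer::sum);
        double e = 0.0;
        for (int count : freq.values()) {
            double p = (double) count / values.size();
            e -= p * Math.log(p) / Math.log(2.0);
        }
        return e;
    }

    // Rows of data where the given feature equals the given value.
    static HashMap<String, ArrayList<String>> subTable(
            HashMap<String, ArrayList<String>> data, String feature, String value) {
        HashMap<String, ArrayList<String>> sub = new HashMap<>();
        for (String column : data.keySet()) sub.put(column, new ArrayList<>());
        for (int i = 0; i < data.get(feature).size(); i++)
            if (data.get(feature).get(i).equals(value))
                for (String column : data.keySet())
                    sub.get(column).add(data.get(column).get(i));
        return sub;
    }

    // The feature with the lowest weighted entropy, i.e. the highest IG.
    static String highestIGFeature(HashMap<String, ArrayList<String>> data) {
        String best = null;
        double bestEntropy = Double.MAX_VALUE;
        for (String column : data.keySet()) {
            if (column.equals(LABEL)) continue;
            double weighted = 0.0;
            int n = data.get(column).size();
            for (String value : new TreeSet<>(data.get(column))) {
                List<String> subLabels = subTable(data, column, value).get(LABEL);
                weighted += ((double) subLabels.size() / n) * entropy(subLabels);
            }
            if (weighted < bestEntropy) { bestEntropy = weighted; best = column; }
        }
        return best;
    }

    // Recurse until every subset is homogeneous ( entropy 0 ).
    public static HashMap<String, Object> createTree(HashMap<String, ArrayList<String>> data) {
        String feature = highestIGFeature(data);
        HashMap<String, Object> branches = new HashMap<>();
        for (String value : new TreeSet<>(data.get(feature))) {
            HashMap<String, ArrayList<String>> sub = subTable(data, feature, value);
            Set<String> classes = new HashSet<>(sub.get(LABEL));
            if (classes.size() == 1)
                branches.put(value, classes.iterator().next());  // leaf: stop recursing
            else
                branches.put(value, createTree(sub));            // recurse on the subset
        }
        HashMap<String, Object> tree = new HashMap<>();
        tree.put(feature, branches);
        return tree;
    }
}
```

On a toy table with columns x = [a, a, b, b], y = [p, q, p, q] and label = [yes, yes, no, no], the split on x has a weighted entropy of 0, so createTree picks x and returns the nested map {x={a=yes, b=no}}.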

We employ a similar method to predict a label for a given sample. This method takes in the tree ( which we produced in snippet 7 ) and returns a String label. This method is recursive too.

Snippet 8- Making predictions.

The method calls itself recursively as long as the object p[ value ] is a HashMap. Once it isn’t, we have reached the end of a branch, where we find our prediction ( label ).
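As with the builder, a hypothetical Java sketch of that prediction recursion ( walking down nested HashMaps until a String leaf is reached ):

```java
import java.util.HashMap;

public class TreePredictor {

    @SuppressWarnings("unchecked")
    public static String predict(HashMap<String, Object> tree, HashMap<String, String> sample) {
        // Each level of the tree has a single key: the feature to test.
        String feature = tree.keySet().iterator().next();
        HashMap<String, Object> branches = (HashMap<String, Object>) tree.get(feature);
        Object next = branches.get(sample.get(feature));
        if (next instanceof String) return (String) next;   // leaf reached
        return predict((HashMap<String, Object>) next, sample);  // descend further
    }
}
```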

Decision Trees on Android. Cool Right?

Try running the algorithm in both Kotlin and Python. Compare the entropies at each step and you will find that they are equal, which indicates that our algorithm is working fine. Thanks for reading, and Happy Machine Learning 😃!

Credit: BecomingHuman By: Shubham Panchal
