During the 21st century, a revolution in data storage techniques has reduced the cost of storing data, and as a result the amount of data generated is growing exponentially. It was estimated that by 2020 we would have 44 zettabytes of data. Every action we perform generates data, e.g. the click of a button on our phone, social media use, and numerous other activities.
The algorithms we use today, such as Naïve Bayes and k-nearest neighbours (KNN), have roots going back to the 1960s, but technology barriers made them impractical to implement at the time. In the 21st century a lot of innovation has taken place, and as a result hardware has evolved.
The term “Artificial Intelligence” was coined by John McCarthy in 1956 at Dartmouth. AI is basically data-driven: it requires a certain amount of data in order to discover the underlying patterns and relations through training. ML or DL models train on this data, which may be labeled, unlabeled, or observations paired with a reward, and then try to predict or classify.
Below are the techniques used in both ML and DL, classified on the basis of the data.
As seen in Fig. 2, the algorithms are divided according to the data we feed into the model.
As the name “supervised” suggests, here both the training and the testing of the model are done with labeled examples, with the aim of minimizing the loss function. We have all studied in a school where the teacher pointed at every letter of the alphabet and made us recognize its features until we were able to recognize the letters on our own.
Similarly, in a supervised learning algorithm the dataset is generally split into three parts, a training set, a test set, and an evaluation set, in the ratio 70:20:10 respectively. Each dataset is labeled, i.e. each row is mapped to its corresponding label. For example, if we are training a model to classify cats and dogs, each image has its label. These images are sent to the model, where forward propagation takes place and the model makes a prediction. This prediction is used to calculate the loss (using MSE, log loss, or another loss function). The loss is then propagated backward: the loss function is differentiated with respect to the weights and biases, and these are adjusted in steps scaled by the learning rate until the loss settles at a minimum (ideally the global minimum). In later sections we will study these terms in detail with hands-on examples.
Types of supervised algorithms include linear regression, logistic regression, decision trees, neural networks, SVM, etc.
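The loop described above (split, forward pass, loss, gradients, update) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy data `y = 3x + 1`, the helper names `split_dataset`, `mse`, and `train`, and the hyperparameter values are all made up for this example, and a one-weight linear model stands in for a real network.

```python
import random

def split_dataset(rows, ratios=(70, 20, 10), seed=0):
    """Shuffle the rows and split them into training, test, and evaluation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * ratios[0] // sum(ratios)
    n_test = len(rows) * ratios[1] // sum(ratios)
    return rows[:n_train], rows[n_train:n_train + n_test], rows[n_train + n_test:]

def mse(rows, w, b):
    """Mean squared error of the prediction w * x + b over labeled rows."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def train(rows, learning_rate=0.1, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        # Forward pass: prediction error for every labeled example.
        errors = [(w * x + b - y, x) for x, y in rows]
        # Backward pass: derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * e * x for e, x in errors) / n
        grad_b = sum(2 * e for e, _ in errors) / n
        # Update step, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical labeled data: each row is (feature, label) with label = 3x + 1.
data = [(i / 100, 3 * (i / 100) + 1) for i in range(100)]
train_set, test_set, eval_set = split_dataset(data)
w, b = train(train_set)
print("weight:", round(w, 3), "bias:", round(b, 3))
print("test loss:", mse(test_set, w, b))
```

The test and evaluation sets are never used for the weight updates; they only measure how well the fitted weight and bias carry over to rows the model has not seen.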
As the name suggests, unsupervised means the model extracts the underlying patterns or information from the dataset and groups them into clusters. In today’s data-driven era the rate of data generation is exponential; even while reading this post you are generating data. For such amounts of data, labeling is very tedious as well as expensive, costing both time and money.
Here unsupervised learning plays a crucial role: it discovers the underlying patterns and information in the data so that similar data can be grouped together. Unlike supervised learning, no teacher is provided, which means the machine receives no labeled training. The machine is therefore left to find the hidden structure in unlabeled data by itself. Since there are no labels, there is no label-based prediction error to backpropagate in order to adjust the parameters.
Some unsupervised algorithms are Isomap, LLC, PCA, DBSCAN, k-means clustering, k-means++ clustering, Apriori, and various others.
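To make the idea of grouping similar data without labels concrete, here is a rough sketch of k-means clustering in plain Python. The points, the function name `kmeans`, and the fixed iteration count are assumptions chosen for illustration; a library implementation would add convergence checks and better initialization.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters by alternating assignment and centroid updates."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(px for px, _ in c) / len(c), sum(py for _, py in c) / len(c))
            if c else centroids[i]  # keep the old centroid if a cluster is empty
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled data: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```

No label is ever passed in: the two groups emerge only from the distances between the points.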
In recent years, because labeling data is tedious and expensive, a new class of algorithms came into existence: semi-supervised learning. Supervised learning needs a huge amount of labeled data, while unsupervised learning works on unlabeled data, which tends to be less accurate and needs heavy computation. Semi-supervised learning came into existence to tackle these drawbacks of supervised and unsupervised learning.
Semi-supervised learning is a combination of both supervised and unsupervised learning. In this type of learning, the algorithm is trained on a combination of labeled and unlabeled data. This combination contains a very small amount of labeled data and a very large amount of unlabeled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labeled data to label the rest of the unlabeled data.
Some semi-supervised approaches are similarity learning, distance-based learning, etc.
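The cluster-then-label procedure can be sketched as follows. This is a simplified illustration, not a standard library routine: the body-weight data, the two known labels, and the helpers `cluster_1d` and `propagate_labels` are all hypothetical, and a tiny one-dimensional k-means stands in for whatever clustering algorithm is actually used.

```python
from collections import Counter

def cluster_1d(values, centroids, iterations=10):
    """Tiny 1-D k-means: return the cluster index assigned to each value."""
    centroids = list(centroids)
    assignment = []
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            for v in values
        ]
        # Move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == i]
            if members:
                centroids[i] = sum(members) / len(members)
    return assignment

def propagate_labels(values, known_labels, initial_centroids):
    """Cluster all values, then give each cluster the majority label of its labeled members."""
    assignment = cluster_1d(values, initial_centroids)
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(assignment[index], Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Clusters without any labeled member stay unlabeled (None).
    return [cluster_label.get(a) for a in assignment]

# Hypothetical data: body weights in kg; only two of the eight rows carry a label.
weights = [3.8, 4.1, 4.5, 5.0, 22.0, 25.5, 28.0, 30.0]
known = {0: "cat", 6: "dog"}
labels = propagate_labels(weights, known, initial_centroids=[min(weights), max(weights)])
print(labels)
```

Two hand-made labels are enough here because the clusters are clean; with overlapping clusters the majority vote can assign wrong labels, which is one reason real pipelines keep a labeled evaluation set.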
In reinforcement learning the underlying base is the same, i.e. deep learning is used, but here no dataset is given to the network. Instead, the setup is composed of an Agent, an Environment, a Reward, a State/observation, and a Policy.
The Agent acts on the environment according to the principle defined by its policy. As a result, the environment gives a reward to the agent and the state of the environment changes. This process is repeated many times until the agent is able to take the correct and necessary decisions, given the state or its observation of the environment, in order to increase the total reward.
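The agent, environment, reward, state, and policy loop can be sketched with tabular Q-learning. Note the swap: a small lookup table replaces the deep network mentioned above so the example stays short. The five-cell corridor environment, the function names `step` and `train`, and the hyperparameter values are assumptions made purely for illustration.

```python
import random

N_STATES = 5        # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)  # step left or step right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action_index] of expected discounted rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Policy: epsilon-greedy, exploring at random on ties as well.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells: the action with the larger learned value.
policy = [ACTIONS[0] if left > right else ACTIONS[1] for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct; it only sees the reward at the goal, and the learned values gradually carry that reward back to the earlier cells.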
Types of reinforcement learning algorithms include the Markov decision process, Q-learning, deep Q-networks (DQN), and various others.
During our initial stages, we faced several environment issues, i.e. how to run scripts, where to run them, and which packages need to be installed. How to set up the environment with Jupyter Notebook is discussed in the environment set-up blog linked below.
Jovian.ml is a platform that enables us to share our notebooks. A Jovian IPython notebook is compatible with Google Colab, Kaggle, and Binder. Binder is a server on which we can run normal Python scripts, i.e. without Jupyter, whereas Kaggle and Colab are environments that provide us with free GPUs. We can use Jovian.ml by creating an account and accessing its API. For a basic tutorial, follow this link and the jovian.ml docs.
As we say, “A car is useless if it doesn’t have a good engine”; similarly, a student is useless without proper guidance and motivation. I would like to thank, from the bottom of my heart, my Guru and Idol “Dr P. Supraja”, who guided me throughout the journey. As a Guru, she lit the best available path for me and motivated me whenever I encountered a failure or roadblock. Without her support and motivation this would have been an impossible task for me.
AI: Artificial Intelligence
ML: Machine Learning
DL: Deep Learning
Model: The algorithm we use in deep learning or machine learning, such as linear regression, NN, or CNN, is known as a model; training data is fitted to this model for prediction or classification.
If you have any queries, feel free to contact me through any of the options below:
Google Form: https://forms.gle/mhDYQKQJKtAKP78V7
Jupyter Notebook: https://jovian.ml/tiwari12-rst/installationguide
Data Generation: https://www.weforum.org
Environment Set-up: https://firstname.lastname@example.org
Jovian Notebook: https://jovian.ml/tiwari12-rst/installationguide