Last year I searched for a proper TensorFlow tutorial but could not find one; the material was scattered. Google’s TensorFlow documentation is so dull that I needed caffeine every five minutes to keep going. So here is a tutorial series in which I try to cover TensorFlow, machine learning (supervised and reinforcement learning, with Google Dopamine and Gym) and a little bit of differential calculus. Don’t be afraid if you don’t know calculus; I will make it as easy as I can. Differential calculus has a special, naughty relationship with machine learning, so I can’t ignore it. Hope you enjoy the show :P So let’s explore like Columbus.
PS: this is a machine learning tutorial series, not an artificial intelligence one. There are differences between machine learning and artificial intelligence. “Machine learning is the study of computer algorithms that improve automatically through experience,” whereas “Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.” I hope that clears it up. #spreadtheawareness
Let’s start by explaining what TensorFlow is. TensorFlow is an open source computational library developed by Google. It is not only machine learning developers who use it; I have also seen it widely used in the genetic engineering field, where you need to work with large arrays and heavy computations. We will use Python for TensorFlow here. TensorFlow always creates a directed acyclic graph (DAG); how the DAG is executed we will cover later. In math, a simple number like 10 or 20 is called a scalar. A vector (not the physics vector) is a 1D array. A matrix (not the movie) is a 2D array, and a 3D array is a 3D tensor. In general, a tensor is an n-dimensional array of data. So in TensorFlow, all the tensors flow through the directed acyclic graph. Hence the library is called TensorFlow.
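To make the scalar, vector, matrix and tensor ladder concrete, here is a small sketch. It uses NumPy rather than TensorFlow (my choice, just to keep it light), and the variable names and values are made up for illustration.

```python
import numpy as np

scalar = np.array(10)                  # 0 dimensions: a single number
vector = np.array([1, 2, 3])           # 1 dimension: a 1D array
matrix = np.array([[1, 2], [3, 4]])    # 2 dimensions: a 2D array
tensor3d = np.zeros((2, 3, 4))         # 3 dimensions: a 3D array

# ndim is the number of dimensions, shape is the size along each one.
for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3d", tensor3d)]:
    print(name, "dimensions:", value.ndim, "shape:", value.shape)
```

All four are n-dimensional arrays; only the number of dimensions changes.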
Why does TensorFlow use a directed acyclic graph to represent the computation? The main answer is portability. You can build the DAG in Python on a high-end CPU, GPU or TPU machine, store it as a saved model, and restore the same graph from C++ on a smaller device such as a mobile phone or a Raspberry Pi. So it gives you both language and hardware portability. This is very similar to how Java works with the JVM.
Now let’s look at the TensorFlow hierarchy. Like most software libraries, TensorFlow has a number of abstraction layers. The lowest level is implemented to target the different types of hardware. The next level is the TensorFlow C++ layer. Above that is a Python wrapper layer, which contains much of the numeric processing code: add, subtract, divide, matrix multiply, creating variables, creating tensors, getting the shape and all the dimensions of a tensor, all that core numeric processing stuff. In the next layer there is a set of Python modules with high-level representations of useful neural network components: a way to create a new layer of hidden neurons with a ReLU activation function is in tf.layers, a way to compute the root mean squared error on data as it comes in is in tf.metrics, and a way to calculate cross entropy with logits is in tf.losses. These modules provide components that are useful when building custom NN models. And then the highest layer: the Estimator. It knows how to do distributed training, how to evaluate, how to create a checkpoint, how to save a model, and how to set it up for serving. It comes with everything done sensibly, in a way that fits most machine learning models in production. So, if you see example TensorFlow code on GitHub that doesn’t use the Estimator API, ignore that code and walk away; it’s not worth it. In the examples, I will use non-Estimator code first and provide the Estimator code later.
Let’s first write the code in NumPy. If you want to add two arrays in NumPy, you create the two arrays and the addition is calculated right away. Simple, right?
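A minimal sketch of that NumPy addition could look like this (the array values are just example numbers I picked):

```python
import numpy as np

a = np.array([5, 3, 8])
b = np.array([3, -1, 2])

# np.add runs immediately: c already holds the summed values.
c = np.add(a, b)
print(c.tolist())  # [8, 2, 10]
```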
So now, let’s look at the TensorFlow code for the same thing. If you want to add two tensors a and b, you write tf.add(a, b). It returns a tensor c. Unlike typical Python code, though, calling tf.add doesn’t execute the addition; it only builds the DAG. In the DAG, a, b and c are tensors and add is an operation. To execute the DAG, you need to run it as part of what is called a session. You say that you want the value of c and you ask the session, “Hey session, please evaluate c for me.” That is what runs the DAG, and you get back a traditional numeric array in Python that contains the values of c. Programming TensorFlow involves programming a DAG, so there are two steps. The first step: create the graph. The second step: run the graph. The graph definition is separate from the training loop because this is a lazy evaluation model. It minimizes the Python to C++ context switches and enables the computation to be very efficient. Note that c, after you call tf.add, does not hold the actual values. You have to evaluate c in the context of a TensorFlow session. So, to reiterate, TensorFlow does lazy evaluation: you write a DAG, and then you run the DAG in the context of a session to get results.
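Here is a rough sketch of those two steps. I am assuming the tensorflow.compat.v1 module and a call to disable_eager_execution so that the session style also runs on newer TensorFlow installs; on an old 1.x install a plain "import tensorflow as tf" would do. The values are only examples.

```python
import tensorflow.compat.v1 as tf

# Assumption: force graph (lazy) mode so the example behaves like TensorFlow 1.x.
tf.disable_eager_execution()

# Step 1: create the graph. Nothing is computed here.
a = tf.constant([5, 3, 8])
b = tf.constant([3, -1, 2])
c = tf.add(a, b)
print(c)  # a symbolic Tensor description, not the summed values

# Step 2: run the graph inside a session to get the actual numbers.
with tf.Session() as sess:
    result = sess.run(c)

print(result.tolist())  # [8, 2, 10]
```

Notice that printing c before the session shows only the tensor's name, shape and dtype; the numbers appear once the session has evaluated it.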
Now, there is a different mode in which you can run TensorFlow. It’s called tf.eager, and in tf.eager the evaluation is immediate, not lazy. But eager mode is typically not used in production programs; it’s typically used only for development. We’ll look at tf.eager a little later in this series, but for the most part we’ll focus on the lazy evaluation paradigm, and almost all the code that we write and run in production will be in lazy evaluation mode. So now, if anyone asks you why TensorFlow does lazy evaluation, you can give them an answer.
Lazy evaluation has some other advantages too. In the directed acyclic graph (DAG) there are many edges and nodes. The edges represent data: they are tensors, which, as we now know, are n-dimensional arrays. The nodes represent the TensorFlow operations applied to those edges, like addition or multiplication. Lazy evaluation allows a lot of flexibility and optimization when you execute a larger computation graph: for example, if there are two consecutive ‘Add’ nodes, TensorFlow can merge them into a single ‘Add’ node. TensorFlow can also assign different parts of the DAG to different devices, depending on device compatibility.
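To give a feel for why a graph that has not run yet can be optimized, here is a toy model in plain Python. It is not TensorFlow code and not how TensorFlow's optimizer is implemented; the Node class and the fuse_adds, run and count_adds helpers are all hypothetical names I made up for this sketch.

```python
class Node:
    """Toy graph node: an operation plus the nodes feeding into it."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = list(inputs)
        self.value = value

def constant(value):
    return Node("Const", value=value)

def add(*inputs):
    return Node("Add", inputs)

def fuse_adds(node):
    """Rewrite nested Add nodes, e.g. Add(Add(x, y), z), into one Add(x, y, z)."""
    if node.op != "Add":
        return node
    flat = []
    for child in node.inputs:
        child = fuse_adds(child)
        if child.op == "Add":
            flat.extend(child.inputs)  # pull the inner Add's inputs up
        else:
            flat.append(child)
    return Node("Add", flat)

def run(node):
    """Evaluate the graph; nothing is computed until this is called."""
    if node.op == "Const":
        return node.value
    return sum(run(child) for child in node.inputs)

def count_adds(node):
    return int(node.op == "Add") + sum(count_adds(c) for c in node.inputs)

graph = add(add(constant(1), constant(2)), constant(3))  # two consecutive Adds
fused = fuse_adds(graph)

print(count_adds(graph), run(graph))  # 2 6
print(count_adds(fused), run(fused))  # 1 6
```

Because the graph is only a description until run is called, it can be rewritten first without changing the result.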
Now let’s see how we can use tf.eager. In the development environment, when we debug the code, we sometimes need to see the output right after an operation. It is strongly recommended not to use tf.eager in the production environment. Let’s see the code without tf.eager first, then we will replicate the same logic with tf.eager.
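A sketch of the version without tf.eager might look like the following. As an assumption on my side, it imports tensorflow.compat.v1 and turns eager execution off explicitly so that it also runs on newer installs; the numbers are arbitrary.

```python
import tensorflow.compat.v1 as tf

# Assumption: make sure we are in graph (lazy) mode.
tf.disable_eager_execution()

x = tf.constant([3, 5, 7])
y = tf.constant([1, 2, 3])
z = tf.subtract(x, y)

# While debugging, this print does not help much: it shows the
# symbolic tensor, not the values.
print(z)

# The values only exist after a session runs the graph.
with tf.Session() as sess:
    values = sess.run(z)

print(values.tolist())  # [2, 3, 4]
```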
Now with tf.eager:
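Here is the same subtraction as a sketch in eager mode. I am assuming tensorflow.compat.v1 and its enable_eager_execution call, which has to happen at program start; older 1.x code reached eager mode through a contrib module instead, and in TensorFlow 2 eager is already the default.

```python
import tensorflow.compat.v1 as tf

# Assumption: switch on eager mode before creating any tensors.
tf.enable_eager_execution()

x = tf.constant([3, 5, 7])
y = tf.constant([1, 2, 3])

# No session: the subtraction is executed immediately.
z = tf.subtract(x, y)

print(z.numpy().tolist())  # [2, 3, 4]
```

There is no graph-then-run split here, which is what makes this mode handy when debugging.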
So far we have seen how to write the graph and run it. Now we will see how to visualize the graph. Every once in a while you want to see the operations, what data feeds into them, et cetera. You might also want to visualize the architecture of your neural networks. So we will use tf.summary.FileWriter to write out the session graph. In the code below we use a variable called “writer” to handle the logic. After executing the code below, a folder called “output” will be generated in the current directory. Don’t forget to close the writer before the main execution ends, because it is an expensive object.
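A minimal sketch of writing out the session graph could look like this. The tensorflow.compat.v1 import and the disable_eager_execution call are my assumptions so that tf.summary.FileWriter works on newer installs; the tensor names and values are examples.

```python
import tensorflow.compat.v1 as tf

# Assumption: FileWriter needs graph mode, so eager execution is switched off.
tf.disable_eager_execution()

with tf.Session() as sess:
    # Naming the tensors makes them easier to find in the visualization.
    a = tf.constant([5, 3, 8], name="a")
    b = tf.constant([3, -1, 2], name="b")
    c = tf.add(a, b, name="c")

    # Write the session graph into the "output" folder.
    writer = tf.summary.FileWriter("output", sess.graph)
    result = sess.run(c)

    # Close the writer so the event file is flushed and released.
    writer.close()

print(result.tolist())  # [8, 2, 10]
```

After this runs, the “output” folder should contain an events file that TensorBoard can read.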
To view the graph in the browser, run “tensorboard --logdir=<SummaryFileName>” in the terminal. Usually, the TensorBoard page will be accessible at http://localhost:6006.