Bar charts are a simple way to compare a feature across different categories. But when each category has many values, for example one per year, a single static bar chart no longer helps. This is where a bar chart race comes in: it animates the bars to show how a feature changes over time for each category. In this article, I will take you through creating an animated bar chart race with Python.
To create the animated bar chart race, I will use a GDP per capita forecast dataset provided by the OECD. The original dataset is available from the OECD.
I’ll start by importing the packages we need for this task. I will be using Pandas for data management, Matplotlib for charts, and NumPy for matrix operations:
import colorsys
from random import randint
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.animation as animation
import matplotlib.colors as mc
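We only need GDP per capita as a feature for this task, so we filter out the other variables and the aggregate regions, then pivot the table so that each country is a row and each time period is a column:
# Load the OECD export and keep only the GDP per capita series
df = pd.read_csv("gdp per capita.csv")
df = df.loc[df['Variable'] == 'GDP per capita in USA 2005 PPPs']
# Drop aggregate regions so that only individual countries remain
aggregates = ['OECD - Total', 'Non-OECD Economies', 'World', 'Euro area (15 countries)']
df = df[~df['Country'].isin(aggregates)]
df = df[['Country', 'Time', 'Value']]
# Reshape to one row per country and one column per time period
df = df.pivot(index='Country', columns='Time', values='Value')
df = df.reset_index()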
I add a “^” to the name of each new column, so that when I display the unit of time I can easily strip everything after that character. When the last column of the DataFrame is reached, the code raises an error because df.iloc cannot select a column that does not exist. Therefore, I wrap the column selection in a try/except statement.
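A minimal sketch of this step could look like the following. It assumes that the new columns are midpoints between neighbouring time columns, added to make the animation smoother. The toy table `wide`, the helper `add_midpoint_columns`, and all values are hypothetical; they only illustrate the “^” naming and the try/except guard.

```python
import pandas as pd

# Hypothetical stand-in for the pivoted GDP table (one row per country,
# one column per year); the values are made up for illustration.
wide = pd.DataFrame({
    "Country": ["A", "B"],
    2010: [10.0, 20.0],
    2011: [20.0, 40.0],
    2012: [40.0, 30.0],
})

def add_midpoint_columns(frame):
    """Insert one averaged column between each pair of neighbouring time columns."""
    frame = frame.copy()
    i = 0
    while i < len(frame.columns):
        try:
            left = frame.iloc[:, i + 1]
            right = frame.iloc[:, i + 2]  # raises IndexError past the last column
            # Everything after "^" only keeps the column name unique.
            name = f"{left.name}^{len(frame.columns)}"
            frame.insert(i + 2, name, (left + right) / 2)
        except IndexError:
            break  # no neighbour left to average with
        i += 2
    return frame

smooth = add_midpoint_columns(wide)

# When displaying the time unit, drop whatever follows "^".
labels = [str(col).split("^")[0] for col in smooth.columns]
```

Calling the helper several times in a row would add further in-between columns, at the cost of a longer animation.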
When creating animations with Matplotlib, we always need a list of frames; each entry is passed in turn to the main function, which draws one image. Here we create this list of frames by taking all the unique values from the series “time_unit” and converting them to a list.
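As a rough sketch, assuming the wide table is first melted into long format with a column called `time_unit`, the frame list could be built like this. The toy data and the name `long_df` are hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per country, one column per time unit.
wide = pd.DataFrame({
    "Country": ["A", "B"],
    "2010": [10.0, 20.0],
    "2010^4": [15.0, 30.0],
    "2011": [20.0, 40.0],
})

# Long format: one row per (country, time unit) pair.
long_df = wide.melt(id_vars="Country", var_name="time_unit", value_name="value")

# unique() keeps the order of first appearance, so the frames stay chronological.
frames_list = long_df["time_unit"].unique().tolist()
```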
The next step assigns the colours of the bar graph. First of all, I use a function that can turn any colour into a lighter or darker shade. It relies on colorsys, a module from the Python standard library.
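One way such a function could be written is sketched below. It assumes colours are given as RGB tuples with channels between 0 and 1. The name `transform_color`, the `amount` convention, and the sample palette are assumptions for illustration.

```python
import colorsys

def transform_color(rgb, amount=0.5):
    """Return a lighter (amount < 1) or darker (amount > 1) shade of an RGB tuple."""
    h, l, s = colorsys.rgb_to_hls(*rgb)
    # Scale the distance to white, then clamp lightness to the valid range.
    new_l = max(0.0, min(1.0, 1 - amount * (1 - l)))
    return colorsys.hls_to_rgb(h, new_l, s)

# Hypothetical base colours, one per category.
base_colors = {"A": (0.2, 0.4, 0.6), "B": (0.8, 0.3, 0.2)}

# Lighter fill colours and darker outline colours derived from the same base.
normal_colors = {k: transform_color(v, 0.7) for k, v in base_colors.items()}
dark_color = {k: transform_color(v, 1.3) for k, v in base_colors.items()}
```

Working in HLS space changes only the lightness, so each bar keeps its hue while the outline gets visibly darker.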
Next, I draw the bar chart for one particular time frame, showing the top elements and colouring each bar with its entry from the normal_colors dictionary. To make the graph look nicer, I draw a darker outline around each bar using the respective colour from the dark_color dictionary.
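A simplified sketch of such a drawing function follows. The long-format table `long_df`, the colour values, and the `TOP_N` cut-off are hypothetical placeholders; only the name `draw_barchart` and the two dictionary names come from the text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical long-format data: one row per (country, time unit) pair.
long_df = pd.DataFrame({
    "Country": ["A", "B", "C", "A", "B", "C"],
    "time_unit": ["2010", "2010", "2010", "2010^4", "2010^4", "2010^4"],
    "value": [10.0, 30.0, 20.0, 15.0, 28.0, 25.0],
})
normal_colors = {"A": (0.6, 0.7, 0.8), "B": (0.8, 0.6, 0.6), "C": (0.6, 0.8, 0.6)}
dark_color = {"A": (0.2, 0.3, 0.4), "B": (0.4, 0.2, 0.2), "C": (0.2, 0.4, 0.2)}
TOP_N = 2  # assumed number of bars to show per frame

fig, ax = plt.subplots(figsize=(8, 4))

def draw_barchart(time_unit):
    """Redraw the axes with the TOP_N largest values for one frame."""
    top = (long_df[long_df["time_unit"] == time_unit]
           .sort_values("value", ascending=True)
           .tail(TOP_N))
    ax.clear()
    ax.barh(top["Country"], top["value"],
            color=[normal_colors[c] for c in top["Country"]],
            edgecolor=[dark_color[c] for c in top["Country"]],
            linewidth=2)
    # Show only the part of the time label before "^".
    ax.set_title(str(time_unit).split("^")[0])
    return top

shown = draw_barchart("2010^4")
```

Sorting in ascending order places the largest value at the top of the horizontal bar chart, which is the usual layout for a bar chart race.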
The last step in every Matplotlib animation is to create a FuncAnimation object, which calls the drawing function once per frame to run the bar chart race:
animator = animation.FuncAnimation(fig, draw_barchart, frames=frames_list)
animator.save('myAnimation.gif', writer='imagemagick', fps=30)
plt.show()
I hope you liked this article on bar chart race animation. Feel free to ask your questions in the comments section below. You can also follow me on Medium to learn more about Machine Learning.