Now let’s use a toy example to understand the practical application of a z-score.
For the demonstration, I have created a dummy dataset that represents the test scores (in percentage) of some 45 students in a class. The distribution of data is shown below.
Using z-scores we will answer the following questions:
- If student A scored 63% on the test, how well did he or she do as compared to other students?
- What is the probability of a student scoring more than 63%?
From the data, we have mean = 70.5 and std = 11.4. Our first step is to standardize the data. Substituting the mean and std into the z-score formula, z = (x - mean) / std, gives the z-score for 63% as (63 - 70.5) / 11.4 ≈ -0.66.
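The standardization step can be checked with a few lines of Python. This is only a minimal sketch: the variable names are my own, and the mean and std are the values quoted above rather than being recomputed from the dataset.

```python
# Summary statistics quoted in the article (not recomputed from raw data)
mean = 70.5
std = 11.4

# Student A's test score, in percent
score = 63.0

# z-score: how many standard deviations the score lies from the mean
z = (score - mean) / std

print(round(z, 2))  # -0.66
```

A negative z-score simply means the score is below the class average.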
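Since our data is normally distributed, we can divide the distribution into 3 zones as shown in the figure below: zone A lies to the left of 63%, zone B lies between 63% and the mean, and zone C lies to the right of the mean. To answer the first question, we have to find the percentage of the population that falls in zone A.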
We can determine the percentage of the population in zone A by using the z table. For a z-score of -0.66, find -0.6 on the left and 0.06 on the top and look up the corresponding area (same as a probability).
This table is used only for negative values of z-scores and gives the area under the curve to the left of z.
The value for -0.66 is 0.2546, or about 25.46%, so A = 25.46%. This means that student A scored better than 25.46% of the students who took the test. To estimate the number of students who scored below 63%, multiply that proportion by the total number of students: 0.2546 x 45 ≈ 11.5. So roughly 11 or 12 students scored less than 63%; in other words, student A outperformed about 11 or 12 classmates.
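If you do not have a z-table at hand, the same lookup can be sketched with Python's standard library. `statistics.NormalDist` (available from Python 3.8) defaults to the standard normal distribution, and its `cdf` method returns the area to the left of a value. The variable names here are illustrative only.

```python
from statistics import NormalDist

z = -0.66          # z-score for a test score of 63%
n_students = 45    # class size from the dummy dataset

# Area under the standard normal curve to the left of z (zone A)
area_a = NormalDist().cdf(z)

# Expected number of students scoring below 63%
students_below = area_a * n_students

print(round(area_a, 4))  # 0.2546
print(students_below)    # about 11.5
```

Because the result is an expected count based on a fitted normal curve, it will not necessarily match the exact number of students below 63% in the raw data.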
To answer the second question, we need the combined population under zone B and zone C. C is 50% because a normal distribution is symmetric about its mean, so half of the area lies to the right of the mean. You can also confirm this from the z-table for positive z-scores: the area under the curve to the right of z = 0.00 is 0.5000 (i.e. 50%).
Assuming B is x%, we have the equation 25.46 + x + 50 = 100. Solving for x, we get x = 24.54%. Therefore, the probability that a student will score more than 63% is 74.54% (24.54% + 50%), or roughly 75%.
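The same probability can be obtained directly as the complement of the left-tail area, without splitting the curve into zones. The sketch below assumes the scores follow a normal distribution with the mean and std quoted earlier; the names `dist` and `p_above` are my own.

```python
from statistics import NormalDist

# Normal distribution with the article's mean and standard deviation
dist = NormalDist(mu=70.5, sigma=11.4)

# P(score > 63) = 1 - P(score <= 63)
p_above = 1 - dist.cdf(63)

print(f"{p_above:.1%}")
```

The result comes out very close to the table-based answer. Any small difference arises because the table lookup uses the z-score rounded to two decimals (-0.66), while this code works with the unrounded value.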
And that’s everything. In this article, we learned the practical applications of the z-score. Thank you for reading. Get in touch via LinkedIn if you have any questions.