Face detection is a fundamental building block of face recognition technology. Facebook, Amazon, Google, and other tech companies each have their own implementations. Before their software can recognize a face, it must be able to detect one first. Amazon has developed a system for real-time face detection and recognition using cameras. Facebook uses face detection mostly on photos its users upload in order to suggest tagging friends.
- Any operating system that will support OpenCV and Python (Windows, Linux, MacOS)
- Haar Cascades Data File
- i3 or higher core processor (CPU)/ 2.1 GHz or higher
- Photo images for testing
I used a 2010 Sony VAIO laptop with a 2.1 GHz i3 processor and 8 GB of memory, running Windows 7 Professional with at minimum Service Pack 1 installed, along with Python 2.7.14 and OpenCV 3.4.3. The Haar Cascades data file, along with the code, is available from my GitHub link.
In this project, I applied face detection to some photos I took using OpenCV with Python. OpenCV is an open-source software library that gives developers access to an API (Application Programming Interface) of routines for computer vision applications. The version I used was the Python distribution, called OpenCV-Python.
To install, make sure you have pip (the Python package installer) installed with Python, and run from the command line (Windows command prompt or Linux/MacOS terminal):
pip install opencv-python
This will install the main modules on your system for use with OpenCV.
What is “Face Detection”?
Face detection is a type of application classified under "computer vision" technology. It is the process in which algorithms are developed and trained to properly locate faces (or, in the related problem of object detection, objects) in images. These can come in real time from a video camera or from photographs. One example where this technology is used is in airport security systems. In order to recognize a face, the camera software must first detect it and identify its features before making an identification. Likewise, when Facebook makes tagging suggestions to identify people in photos, it must first locate the face. On social media apps like Snapchat, face detection is required for the augmented-reality filters that let users virtually wear dog face masks. Another use of face detection is in smartphone face ID security.
In this project, I implemented a system for locating faces in digital images, in JPEG format only. Before we continue, we must differentiate between face recognition and face detection. They are not the same, but one depends on the other: face recognition needs face detection before it can make an identification to "recognize" a face. I will only cover face detection.
Face detection uses classifiers, which are algorithms that decide whether a region of an image is a face (1) or not a face (0). Classifiers are trained on thousands to millions of images in order to improve accuracy. OpenCV ships two types of classifiers, LBP (Local Binary Pattern) and Haar Cascades. I will be using the latter.
Understanding Haar Cascades
A Haar Cascade is based on “Haar Wavelets” which Wikipedia defines as:
A sequence of rescaled “square-shaped” functions which together form a wavelet family or basis.
The technique analyzes pixels in an image using these square-shaped functions. It relies on machine learning to reach a high degree of accuracy from what is called "training data", and uses the "integral image" concept to compute the detected "features" efficiently. Haar Cascades use the AdaBoost learning algorithm, which selects a small number of important features from a large set to produce an efficient cascade of classifiers.
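The "integral image" idea can be illustrated with plain NumPy (a toy 3x3 example, not OpenCV's internal implementation): each cell holds the sum of all pixels above and to the left of it, so the sum of any rectangular region — the basis of a Haar feature — can be computed from just four corner lookups, regardless of rectangle size.

```python
import numpy as np

# Toy 3x3 "image". The integral image at (y, x) holds the sum of all
# pixels above and to the left of (y, x), inclusive.
img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=np.int64)
ii = img.cumsum(axis=0).cumsum(axis=1)

# Sum of any rectangle in constant time from four corner lookups.
# Rectangle covering rows 1-2, cols 1-2 (values 5, 6, 8, 9):
total = ii[2, 2] - ii[0, 2] - ii[2, 0] + ii[0, 0]
print(total)  # 28
```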
As I mentioned earlier, Haar Cascades use machine learning techniques in which a function is trained from a lot of positive and negative images. This process in the algorithm is feature extraction.
The training data used in this project is an XML file called haarcascade_frontalface_default.xml.
For this project I prepared a directory where I dumped all the files needed. You will need to put in this directory the following:
- face_detection.py (the name I gave the Python program containing the code; you can rename it)
- haarcascade_frontalface_default.xml (Haar Cascade training data)
I used my own photos, which are for testing purposes only (I own all copyrights to them). The photos were either taken at a public event or come from my photography portfolio. For uniformity in testing, I chose an image resolution of 500 x 331 pixels in JPEG format for each photo tested. I don't need to test large-resolution files for this project, but OpenCV can be used for that as well.
I have a total of 9 images for 7 tests: 7 images with human faces and 2 images of non-human faces to make things more interesting. The purpose of the tests is to see the accuracy of face detection across a variety of samples. We tend to get higher accuracy on a specific type of image, where the placement of the face is very detectable and ideal. Instead I use various images where the face is not always located dead center, or where there is a group of people (more than 1 face in an image). I also want to find out whether the algorithm has any bias when it comes to non-human faces, so I used a monkey and a cat photo (a great excuse to view cat pics).
We are going to use the detectMultiScale method from OpenCV. It returns a rectangle with coordinates (x,y,w,h) around each face detected in the image. These are the most important parameters to consider:
scaleFactor: Specifies how much the image size is reduced at each image scale. A value closer to 1 uses smaller downscaling steps, which gives the algorithm more chances to detect a face but takes longer. It takes a floating-point value greater than 1; for example, 1.1 reduces the image by 10% at each step.
minNeighbors: Specifies how many "neighbors" (overlapping candidate rectangles) each detection should have. It takes an integer value; a higher value results in fewer detections, but of higher quality.
minSize: The minimum object size, (30,30) by default. The smaller the faces in the image, the lower you should set minSize.
To run the program, execute it from the command prompt in the working directory where you placed all your files. Your computer may not process the code as fast as a more recent configuration if you are using an older-generation x86 processor. I had a mobile Intel i3 (Arrandale), which has decent performance, but a higher-clocked processor will certainly outperform it. At the command prompt, type the following command:
python face_detection.py
When a face is detected, a green rectangle will be generated around the face. A viewer window will also pop-up to show the results.
Test #1. Single Face
I used two photos with a single face. However, in one photo I have the subject wearing shades. I wanted to test to see whether I can get a face detection even when the face was obscured by an object (the shades).
Both the faces were successfully detected.
Test #2. Two Faces
Next, a photo with 2 faces showing different expressions. When there are more faces in an image, certain adjustments need to be made. Since the faces are smaller in the frame, I changed the minSize values until the program made the proper face detection. Even with one subject smirking and sticking out a tongue, both faces were isolated and detected. The algorithm can still detect faces whether neutral or showing emotion.
Test #3. Three Faces
When you have 3 or more faces, things become more complicated. It now requires setting the parameters to values that can identify the faces correctly. In this photo the subjects are all lined up in the frame, which makes it much easier to locate and detect the faces.
Test #4. Four Faces
In this next photo, there is a group of 4 people, so the algorithm must now detect 4 faces. I expected a result similar to Test #3, but things were not the same. Instead we see one rectangle overlapping 3 of the faces, although all 4 faces were detected. My guess is that the algorithm interpreted the 3 faces together as one large face: the two outer faces may have been detected as eyes, the face in the middle as the nose, and the region at the bottom where the hands were located as the mouth.
This was not the best example of face detection, but this can happen and on other photos could lead to false positives. I did not pursue this one any further, but I’m sure this can still be improved.
Test #5. Five Faces
Now we have a much bigger group of 5 people in the next photo. This combines elements of previous tests with more faces to detect: people wearing shades, faces with expressions, and a group pose. Since larger groups mean smaller faces to detect, I adjusted the value of minSize to (20,20).
All 5 faces were detected in the photo. It seems the closer the faces are together in line, the higher the accuracy in detection.
Test #6. Eight Faces
This is the biggest group photo yet, with 8 detectable faces. When I ran the test, only 7 of the 8 faces were detected. It seems obvious why: as you can see in the photo, the undetected face was not fully exposed. The person's hands were partially covering it, so the algorithm could not properly identify it. I also had to set minSize to the lowest value, (1,1), since this group is much larger than before and the faces are much smaller. I expected that keeping minNeighbors at 5 would hurt detection, but it actually improved the quality of the results when combined with a 1.1 scale factor, which was enough to detect at least 7 faces. Downscaling the image by only 10% per step preserves enough image detail for the algorithm to detect the faces more accurately.
It seems that one way to prevent face detection is to obscure the face in a way where the algorithm cannot gather all the features to locate a face.
Test #7. Non-Human Faces
The first non-human face I tested was the cat photo. I tried various parameters, but the algorithm failed to detect the face with any of the values I entered. Perhaps there is some combination of parameter values I have not tried that would have detected it, but for now I am leaving this as unsuccessful. The features were there for the eyes and mouth, but I assume the face was not readily detectable because the Haar Cascades classifier's training data may not contain enough relevant information; it was mostly human faces, after all.
Now here is the interesting part. I tested a monkey’s face and I was able to get a detection.
Since the monkey is a member of the primate family to which humans belong, the similarities may be the reason why it would be more accurately detected. Certainly more tests can be done with non-human faces to see how it compares with the training data. A more appropriate way to detect animal faces is with a classifier trained using animal photos.
The results lead to no firm conclusion. The code does not detect faces automatically, since the values for the parameters must be set by the user. Not everyone will get consistent results; they all depend on the values chosen for scaleFactor, minNeighbors, and minSize. There is no exact set of values for detecting a given number of faces, so the user has to experiment with values to arrive at a successful detection. The code can detect faces, but it still requires verification from the user. It is therefore not a fully intelligent system, since it requires user interaction.
For access to complete code:
Paul Viola and Michael Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features"