
Limitations of the Viola-Jones Algorithm

For this system, we use the Viola-Jones face detection algorithm through the cascade object detector in MATLAB.

 

Although this algorithm is reliable, it is not without its limitations.

As the following examples show, the algorithm can produce both false positives and false negatives. Each of the following images was processed through the algorithm, and yellow boxes indicate where the algorithm detected a face.

Screen Shot 2019-03-28 at 12.43.25 PM.pn
Screen Shot 2019-03-28 at 12.43.47 PM.pn
Screen Shot 2019-03-28 at 12.44.04 PM.pn

A variety of factors, possibly including the off angles at which the pictures were taken and the fact that this individual is wearing glasses, result in three false positives in the second picture and no detection in the third. In the first picture, the algorithm clearly identifies the face but also produces one false positive on the individual's shirt.

Merge Threshold

Although the false negatives seen in the previous examples can likely be reduced by ensuring a straightforward camera angle and asking users to remove headwear for these pictures, we are still left with the issue of false positives. One solution we found is to modify one of the parameters of the cascade object detector, the merge threshold. Increasing the merge threshold raises the number of overlapping detections the algorithm requires in an area before it confirms a face, thus reducing possible false positives.

 

This can be seen in the following examples.  

Screen Shot 2019-03-28 at 12.35.08 PM.pn
Screen Shot 2019-03-28 at 12.35.42 PM.pn
Screen Shot 2019-03-28 at 12.34.16 PM.pn

As you can see, as the merge threshold increases, the number of false positives falls until only the individual's face is detected.
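The effect of the merge threshold can be illustrated with a small, self-contained Python sketch. This is not the MATLAB detector's actual implementation: the grouping rule, the `min_overlap` parameter, the function names, and the sample boxes are all assumptions made for illustration. It only shows the general idea that a region is reported as a face when enough overlapping raw detections agree on it.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def merge_detections(boxes, merge_threshold, min_overlap=0.5):
    """Group overlapping raw detections and keep groups with enough members."""
    groups = []
    for box in boxes:
        for group in groups:
            # Assumed grouping rule: compare against the group's first box.
            if overlap_ratio(box, group[0]) >= min_overlap:
                group.append(box)
                break
        else:
            groups.append([box])

    merged = []
    for group in groups:
        if len(group) >= merge_threshold:
            n = len(group)
            # Average the member boxes into one final bounding box.
            merged.append(tuple(sum(b[i] for b in group) / n for i in range(4)))
    return merged


# Made-up raw detections: five around a face, two on a shirt, one stray.
raw = [
    (100, 100, 50, 50), (102, 98, 50, 50), (98, 101, 52, 52),
    (101, 103, 49, 49), (99, 99, 51, 51),
    (120, 300, 40, 40), (122, 302, 40, 40),
    (400, 50, 30, 30),
]

for threshold in (1, 2, 4):
    print(threshold, len(merge_detections(raw, threshold)))
```

With these made-up boxes, a threshold of 1 keeps all three regions, a threshold of 2 drops the stray detection, and a threshold of 4 leaves only the face. Raising the threshold too far would eventually discard the face as well, so the value has to be tuned on real images.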

Comparison of Detected Faces

As the final system will be a facial recognition system, its job isn't over once a face is detected. It must also compare the detected face to a database of faces. To do this, we hope to use a comparison of HOG (Histogram of Oriented Gradients) features, which capture the direction and strength of edges within small sections of a picture. The concept of HOG features can be visualized in the following set of pictures: the first is the starting photo, the second has the HOG features superimposed, and the third is a close-up of the HOG features of the individual's right eyebrow.

Screen Shot 2019-03-28 at 5.03.11 PM.png
Screen Shot 2019-03-28 at 5.03.19 PM.png
Screen Shot 2019-03-28 at 5.08.53 PM.png

As you can see, the HOG features outline the brow of the individual, marking the angles made by it and other facial landmarks. We hope that by applying HOG features to the detected face of an individual, we can create a comparison tool that can effectively identify an individual while allowing some room for error caused by minor angle differences between database and input images.
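To make the idea of a histogram of oriented gradients more concrete, the following Python sketch computes the orientation histogram for a single small cell of grayscale pixels. It is a simplified illustration only: the 9-bin layout, the unsigned 0 to 180 degree range, and the function name `cell_histogram` are assumptions, and a full HOG descriptor also normalizes histograms across blocks of cells, which is omitted here.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (a list of pixel rows)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    # Only interior pixels are used so that both neighbors always exist.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal gradient
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            # Unsigned gradient direction in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A vertical edge (dark left, bright right) and a horizontal edge.
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

v_hist = cell_histogram(vertical_edge)
h_hist = cell_histogram(horizontal_edge)
print(v_hist.index(max(v_hist)))  # dominant bin for the vertical edge: 0
print(h_hist.index(max(h_hist)))  # dominant bin for the horizontal edge: 4
```

The two edges land in different bins because their gradients point in different directions, which is what lets the features describe the angle of a brow or other landmark.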

P2-2.PNG

Current Build

In our current build, our system takes in an input image and identifies the face before extracting HOG features and finding a similar image in our database.

However, the system is not perfect. By design, it always finds the best match to the input image from within our database, so it cannot definitively reject a face as not belonging to the set. Our workaround is to use a large database of numerous individuals, many marked as not having clearance, so that the chance of any one person being identified as someone with security clearance is greatly reduced. This can be seen in the following image: although the image was identified as belonging to the dataset, the program did not allow access because the matching dataset image was labeled as someone without clearance.

P3.PNG
P1.PNG

In addition, there are still false negatives, though they are relatively few in number. The system clearly misidentified this individual, but that is most likely because this is the only picture of the input person taken in front of this very distinctive background.
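The best-match behavior described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's actual code: the use of Euclidean distance, the three-element feature vectors, the dictionary layout, and the names `closest_match` and `grant_access` are all hypothetical. It shows why an unknown face is always matched to someone, and why filling the database with entries marked as not cleared makes a wrongful grant less likely.

```python
import math


def closest_match(features, database):
    """Return the database entry whose feature vector is nearest to `features`."""
    return min(database, key=lambda entry: math.dist(features, entry["features"]))


def grant_access(features, database):
    """Grant access only if the closest match is labeled as having clearance."""
    return closest_match(features, database)["clearance"]


# Hypothetical database: short made-up vectors stand in for real HOG features.
database = [
    {"name": "person_a", "features": [0.9, 0.1, 0.3], "clearance": True},
    {"name": "person_b", "features": [0.2, 0.8, 0.5], "clearance": False},
    {"name": "person_c", "features": [0.4, 0.4, 0.9], "clearance": False},
]

# A face that is not in the database still gets a best match.
unknown_face = [0.25, 0.7, 0.55]
match = closest_match(unknown_face, database)
print(match["name"], grant_access(unknown_face, database))  # person_b False
```

One possible extension, not part of the described build, would be to reject any match whose distance exceeds a chosen cutoff, which would let the system report that a face is not in the set at all.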
