Facial recognition seemed a bit daunting to me when I looked at the OpenCV Facial Recognition Tutorial and OpenCV Facial Recognition in Video. The process seemed very convoluted and time consuming. First you had to get the training images, put them in a directory, create a spreadsheet, crop the faces, adjust the angle, create zombie outlines etc. before you could actually train the system to recognize a person.

I saw things differently, as the way a robot should ideally work. First the robot would monitor for any movement. If there is no movement then there isn’t likely to be any need for facial recognition. From the movement a face should be detected and saved, and the Raspberry Pi should be able to sort faces into same or different, i.e. recognize the differences, much as a baby probably learns to tell family members apart.

I also had a different idea for a project. If a robot was asked to find out about a certain celebrity it should be able to search the internet for pictures of that person and learn to recognize them. You should then be able to show it a picture of the celebrity and the robot would be able to tell you who it was. I saw this as part of a general information gathering exercise. The robot would look up the information on Wikipedia and try to understand it, e.g. “Angelina Jolie is an American actress, filmmaker, and humanitarian” would set the program to find out what each of the words “American”, “actress”, “filmmaker” and “humanitarian” means and attempt to build up a semantic net of information.

Then I came across DLIB in the article Machine Learning is Fun! Part 4: Modern Face Recognition with Deep Learning and also in Raspberry Pi Face Recognition. DLIB is pre-trained to detect faces and has commands to do all the cropping and pose rotation. But, and this is the important thing, it can also create a list of encodings defining the features of a face. Each face produces a different set of encoding values. So if you compare the encodings of a new face against existing face encodings they will either be very close, in which case we can assume that it is the same person, or further away, so we can say it is a different face. All that is needed is for the user to put a name to the recognized face.
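
The comparison can be sketched in a few lines. This is only an illustration, not the DLIB API. It assumes each encoding is a list of 128 numbers and that a Euclidean distance under 0.6 means the same person, which is the threshold commonly quoted for DLIB's face recognition model. The function names and the toy vectors are made up for the example.

```python
import math

# Assumed threshold: distances below this are treated as the same person.
MATCH_THRESHOLD = 0.6

def face_distance(known, candidate):
    """Euclidean distance between two equal-length face encodings."""
    return math.dist(known, candidate)

def same_person(known, candidate, threshold=MATCH_THRESHOLD):
    """True when the two encodings are closer than the threshold."""
    return face_distance(known, candidate) < threshold

# Toy 128-value encodings, invented for illustration only.
face_a = [0.10] * 128
face_a_again = [0.11] * 128   # small change: same face, different photo
face_b = [0.20] * 128         # larger change: a different face

print(same_person(face_a, face_a_again))
print(same_person(face_a, face_b))
```

Lowering the threshold makes matching stricter, raising it makes it more forgiving.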

Face Recognition from Motion Detect

So first of all the motion detect. This code uses the multiproc environment and motion detect systems described in previous articles. The facial recognition module is placed in the chain [readcam, motiondetect, buffer, facerecognition, writevideo], where writevideo can be substituted by displayvideo. This creates a library of encodings of anyone caught on camera, which can then be edited to put your name in or left as face-n.
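
A library like this could look roughly as follows. This is a minimal sketch and not the code from the project: the class `FaceLibrary`, its methods and the 0.6 threshold are assumptions for illustration, and the encodings are toy values rather than real DLIB output.

```python
import math

class FaceLibrary:
    """Stores one encoding per label and names unknown faces face-1, face-2, ..."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold   # assumed match distance
        self.faces = {}              # label -> encoding
        self._next_id = 1

    def identify(self, encoding):
        """Return the closest known label, adding a new face-n if none is close."""
        best_label, best_dist = None, self.threshold
        for label, known in self.faces.items():
            dist = math.dist(known, encoding)
            if dist < best_dist:
                best_label, best_dist = label, dist
        if best_label is None:
            best_label = f"face-{self._next_id}"
            self._next_id += 1
            self.faces[best_label] = list(encoding)
        return best_label

    def rename(self, old_label, new_label):
        """Put a real name on an auto-generated label."""
        self.faces[new_label] = self.faces.pop(old_label)

library = FaceLibrary()
first = library.identify([0.10] * 128)    # unknown, stored as face-1
second = library.identify([0.30] * 128)   # far from face-1, stored as face-2
library.rename(first, "Alice")            # "Alice" is a made-up name
print(library.identify([0.11] * 128))     # close to the first face
```

Faces that are never renamed simply keep their face-n label.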

Face Recognition of Celebrities

I created a module that downloads images from the web given a search term. So we can call this module with “angelina jolie” and it will pass a number of images to the facial recognition module which will then store the facial encodings for the faces discovered. You don’t even need to save the images.

However, as you have probably guessed, calling Google with the term “angelina jolie” doesn’t necessarily return only pictures of the actress in question. You will also get pics of other people: Brad Pitt, various children, other celebrities. But this doesn’t matter as the system will group people by their face encodings. At present you still manually edit the labels and add the names. However, you could fairly safely assume that the most popular face will be that of the celebrity you requested and automate that.
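
That automation is not implemented yet, but the idea could be sketched like this. The function name and the sample results are invented. The sketch assumes the recognition step returns a list of face labels for each downloaded image.

```python
from collections import Counter

def guess_celebrity_label(labels_per_image):
    """Return the face label that appears in the most images, or None if no faces."""
    counts = Counter()
    for labels in labels_per_image:
        counts.update(set(labels))   # count each face once per image
    if not counts:
        return None
    label, _ = counts.most_common(1)[0]
    return label

# Invented results for five downloaded images.
results = [["face-1"], ["face-1", "face-2"], ["face-3"], ["face-1"], ["face-2", "face-1"]]
print(guess_celebrity_label(results))
```

Counting each face once per image stops a single collage of the same person from dominating the vote.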

I have also tried this with band names and yes, you can use “the beatles” or “the clash” and the system will recognize them as face1 to face4 (plus a few extras).

Face Recognition from YouTube

I have also created a module which gets a video from YouTube. This uses Pafy (and ffmpeg if you want the sound). Using the same face encodings from the previous example the system can identify Angelina Jolie and Brad Pitt in Mr. & Mrs. Smith, for example in the scene where the two of them dance to Mondo Bongo sung by Joe Strummer, or in the tango scene in the restaurant.


Coral Object Recognition

If you have the Coral USB accelerator, which is used for object detection, then you can use it within the multiproc environment. For example you can use the Google image search to get images of greyhounds, get the object detector to crop the greyhounds out of each image and place the crops in a folder.

Further work

The facial recognition does push the Raspberry Pi to its limits when using video. You can either work on larger frame sizes and get better detection at slower speeds, or use smaller frame sizes and get faster processing but not much facial detection. With the motion detect, one way to speed things up would be to cut the frame down to the ROI (region of interest) of the movement and push only that to the face recognition. However that doesn’t allow identifying the face in the original frame and displaying that. That is something I will look at. I’m also looking at speeding up the search of the feature encodings by indexing according to the number of times a face has been recognized and gradually deleting the very rarely used faces.
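
Both ideas are still untested, so the following is only a sketch of one possible approach. It assumes the ROI is a plain crop whose top-left corner in the full frame is known, and that face boxes are given as (top, right, bottom, left). The function names and the `min_hits` cut-off are made up.

```python
def roi_to_frame(face_box, roi_origin):
    """Translate a (top, right, bottom, left) box from ROI to full-frame coordinates."""
    top, right, bottom, left = face_box
    roi_x, roi_y = roi_origin    # top-left corner of the ROI in the full frame
    return (top + roi_y, right + roi_x, bottom + roi_y, left + roi_x)

def order_and_prune(hit_counts, min_hits=2):
    """Most-recognized labels first; drop labels seen fewer than min_hits times."""
    kept = {label: n for label, n in hit_counts.items() if n >= min_hits}
    return sorted(kept, key=kept.get, reverse=True)

# Motion was detected in a region whose top-left corner is at x=200, y=120,
# and a face was found at (top=10, right=90, bottom=70, left=30) inside it.
print(roi_to_frame((10, 90, 70, 30), (200, 120)))

# Invented recognition counts: face-2 was only ever seen once.
print(order_and_prune({"face-1": 40, "face-2": 1, "face-3": 7}))
```

Searching the most frequently recognized faces first means a regular visitor is usually matched after only a few comparisons.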

DLIB isn’t perfect. It sometimes has problems detecting faces if the light or definition isn’t good. It also sometimes fails to recognize previously known faces and, conversely, sometimes matches different faces as the same person. The failure to recognize a known face isn’t too difficult to overcome: it means the celebrity gets added twice, but labels can be duplicated so this isn’t noticeable. The mismatches would require tightening the matching weight, though this will likely cause more of the missed recognitions.

The code is now available at selftrainingrecognition. It also requires downloading multiprocenv. I welcome any feedback.