Face Recognition on Webcam with CV2

This tutorial uses the OpenCV (cv2) library in Python.

You will learn how to read webcam input, use while loops, detect faces and eyes, load pre-trained xml classifier files and get familiar with the OpenCV library.

If you’d rather see a screen capture version of this face recognition Python tutorial instead of the webcam one, just check out this post here.

Used Where?

  • To get Webcam Input
  • Face Recognition
  • Interacting with games and software
  • Research
  • Security
  • Surveillance

Let’s import the cv2 library:

import cv2

Here is a very straightforward implementation of face recognition using the VideoCapture class from the cv2 library.

Estimated Time

10 mins

Skill Level

Intermediate

Modules

VideoCapture, detectMultiScale, CascadeClassifier, cv2.rectangle, cv2.destroyAllWindows, cv2.imshow, cv2.waitKey

Libraries

cv2

Tutorial Provided by

HolyPython.com

Classifier xml files (already trained)

We need pre-trained xml files that hold the classifier data. Strictly speaking, these Haar cascade classifiers perform face detection: they locate faces in an image rather than identify whose face it is. You load each file into a variable using cv2.CascadeClassifier.

You will need one xml file for eye detection and one for face detection, as below:

You can get the files here.
You can also find a bunch of different, high-quality classifiers on the official GitHub page of OpenCV here. Go ahead and support this awesome open source project if you can.

face_cascade = cv2.CascadeClassifier(r'c:\Users\tt\Desktop\haarcascade_frontalface_default.xml') 
eye_cascade = cv2.CascadeClassifier(r'c:\Users\tt\Desktop\haarcascade_eye.xml')  
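
One thing worth guarding against: cv2.CascadeClassifier does not raise an error when the path is wrong. It returns an empty classifier (its empty() method returns True), and the problem only shows up later when detectMultiScale is called. Below is a minimal sketch of a path check you could run first. The helper name cascade_path is hypothetical and not part of the original tutorial.

```python
from pathlib import Path

def cascade_path(folder, filename):
    """Return the full path to a cascade xml file, or raise if it is missing."""
    path = Path(folder) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Cascade file not found: {path}")
    return str(path)

# Intended use (assumes cv2 is imported and the xml files sit in `folder`):
# face_cascade = cv2.CascadeClassifier(
#     cascade_path(folder, "haarcascade_frontalface_default.xml"))
# assert not face_cascade.empty()
```

Failing early with a clear message is usually easier to debug than an assertion error raised deep inside the detection loop.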

cv2.VideoCapture for Webcam input

Now we can create a video capture input from your webcam using the VideoCapture class of the cv2 library, as follows. The argument 0 selects the default camera:

cap = cv2.VideoCapture(0)

Core of the code

At this point all we need is a while loop that runs the face recognition on each frame:

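while True:
    # Read one frame; ret is False if the frame could not be grabbed
    ret, img = cap.read()
    if not ret:
        break
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray_img, 1.25, 4)

    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 255, 0), 2)
        # Crop the face region; these slices are views into the full frame
        rec_gray = gray_img[y:y+h, x:x+w]
        rec_color = img[y:y+h, x:x+w]

        eyes = eye_cascade.detectMultiScale(rec_gray)

        # Eye coordinates are relative to the face region
        for (ex, ey, ew, eh) in eyes:
            cv2.rectangle(rec_color, (ex, ey), (ex+ew, ey+eh), (0, 127, 255), 2)

    cv2.imshow('Face Recognition', img)

    # Exit when the ESC key (code 27) is pressed
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()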
Webcam Face and Eye recognition (Demonstration only, image taken from John Krasinski's SGN show)
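
In the loop above, eye_cascade only sees the cropped face region, so the eye coordinates it returns are relative to the top-left corner of that face box. Drawing on rec_color still lands in the right place because a NumPy slice is a view into img. If you need eye positions in full-frame coordinates (for logging, say), add the face offset. This is a minimal sketch, and the helper name eye_to_frame is hypothetical:

```python
def eye_to_frame(face_box, eye_box):
    """Translate an eye box from face-relative to full-frame coordinates."""
    fx, fy, _, _ = face_box
    ex, ey, ew, eh = eye_box
    # Only the position shifts; width and height stay the same
    return (fx + ex, fy + ey, ew, eh)

# A face at (100, 50) with an eye found at (20, 30) inside the crop
print(eye_to_frame((100, 50, 200, 200), (20, 30, 40, 25)))  # (120, 80, 40, 25)
```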

Code breakdown

Now let’s break down the code above and analyse some critical lines for learning:

  1. An infinite while loop reads your webcam continuously.
  2. A gray version of each webcam frame is created.
  3. face_cascade.detectMultiScale runs object detection based on the trained face classifier file.
    • This method takes a few parameters that can be important. The first one (gray_img here) is the gray version of our webcam frame.
    • The second, scaleFactor, controls how much the image is shrunk at each step as the detector scans it at several sizes. This matters because the trained classifier works with a fixed face size, so the image is repeatedly scaled down until larger faces fit it. For instance, 1.04 shrinks the image by only 4% per step, which means many steps and more computing resources. On the other hand, something like 1.5 shrinks the image by a factor of 1.5 per step, which is far more efficient but may cause some faces to be missed. This is definitely something to experiment with to find an optimal value (the code above uses 1.25).
    • minNeighbors is another important parameter. It sets how many neighboring detections each candidate rectangle needs in order to be kept as a face. Higher values give fewer false positives but may miss some faces. It usually takes a value between 3 and 6 (the code above uses 4).
  4. Next, a rectangle is drawn around each face.
  5. After that, another for loop detects the eyes within each face and draws boxes around them in the same way.
  6. Then the imshow() method shows the image with the face and eye boxes.
  7. The k variable lets you exit the webcam window by pressing the “ESC” key, which is mapped to the number 27.
  8. Finally, the webcam is released and all cv2 windows are closed to clean up.
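
To make the scaleFactor trade-off concrete, the sketch below counts how many image sizes a detector would scan for a given factor. It is a simplified model, not OpenCV's actual code: it assumes a 24x24 detection window and a 640x480 frame (both chosen for illustration), and it keeps shrinking the frame until the window no longer fits.

```python
def count_scales(width, height, scale_factor, window=24):
    """Count how many times a frame can shrink by scale_factor
    before the detection window no longer fits inside it."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    levels = 0
    w, h = float(width), float(height)
    while w >= window and h >= window:
        levels += 1
        w /= scale_factor
        h /= scale_factor
    return levels

# Smaller factors mean many more sizes to scan
for factor in (1.04, 1.25, 1.5):
    print(factor, count_scales(640, 480, factor))
```

Under these assumptions, a factor of 1.04 scans roughly ten times as many sizes as 1.5, which is where the extra computing cost comes from.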

Conclusion

Voila! It’s not as intimidating as it looks. Especially in Python code.

Do you have any innovative ideas about what else can be done using this tech?

Don’t worry if you don’t have any. Just try to replicate the code by yourself; the process usually sparks new ideas.

Also it’s good coding practice and pure fun!

Face Recognition on Screen Capture with CV2

This tutorial uses the OpenCV (cv2) library in Python combined with the PIL library’s ImageGrab for screen capture and numpy’s numpy.array to turn the captured image into a numeric array.

You will learn how to get a continuous screen capture using PIL’s ImageGrab.grab, use while loops, for loops and user-defined functions, detect faces and eyes, load pre-trained xml files, and try out OpenCV’s Canny edge detection function.

Used Where?

  • To capture screen
  • Face Recognition
  • Interacting with games and software
  • Research
  • Security
  • Surveillance

Let’s import the libraries we need: ImageGrab from PIL, cv2 and numpy:

from PIL import ImageGrab
import cv2
import numpy as np

Here is a very straightforward implementation of face recognition using PIL’s ImageGrab module together with the cv2 library.

If you’d like to see a Webcam input version of this face recognition Python tutorial instead of Screen Capture just check out this post here.

Estimated Time

15 mins

Skill Level

Intermediate

Modules

ImageGrab, np.array, detectMultiScale, CascadeClassifier, cv2.rectangle, cv2.destroyAllWindows, cv2.imshow, cv2.waitKey

Libraries

PIL, cv2, numpy

Tutorial Provided by

HolyPython.com

Classifier xml files (already trained)

  • We need pre-trained xml files that hold the classifier data for face recognition. You load each file into a variable using cv2.CascadeClassifier.
  • You will need one xml file for eye detection and one for face detection, as below:
  • You can get the files here.
  • You can also find a bunch of different, high-quality classifiers on the official GitHub page of OpenCV here. Go ahead and support this awesome open source project if you can.
face_cascade = cv2.CascadeClassifier(r'c:\Users\tt\Desktop\haarcascade_frontalface_default.xml') 
eye_cascade = cv2.CascadeClassifier(r'c:\Users\tt\Desktop\haarcascade_eye.xml')  

cvtColor and Canny methods of opencv

  • A helper function called process_img, which is called later inside the main function below, converts the image to gray for easier processing and then applies the Canny method, a sophisticated edge detection function.
  • Using the Canny method feels like being spoiled, since it does all the complicated science in one word. It is not machine learning, though; it is a classic multi-stage computer vision algorithm. You can read more about it on the official OpenCV page here. Basically, it finds the edges of the objects in the image.
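def process_img(image):
    # ImageGrab delivers RGB data, so convert from RGB (not BGR) to gray
    processed_img = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Canny edge detection with hysteresis thresholds 200 (low) and 300 (high)
    processed_img = cv2.Canny(processed_img, threshold1=200, threshold2=300)
    return processed_img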
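
The two thresholds passed to cv2.Canny drive its hysteresis step: gradients at or above the higher threshold count as edges outright, gradients below the lower threshold are discarded, and those in between are kept only if they connect to a strong edge. Below is a toy 1-D sketch of that rule. The real algorithm works on 2-D gradient maps after further processing, and the function name and sample values here are made up for illustration.

```python
def hysteresis_1d(magnitudes, low=200, high=300):
    """Return a keep/drop flag for each gradient magnitude in a 1-D row."""
    strong = [m >= high for m in magnitudes]
    weak = [low <= m < high for m in magnitudes]
    keep = strong[:]
    changed = True
    # Repeatedly promote weak values that touch an already kept neighbor
    while changed:
        changed = False
        for i, is_weak in enumerate(weak):
            if is_weak and not keep[i]:
                left = i > 0 and keep[i - 1]
                right = i + 1 < len(keep) and keep[i + 1]
                if left or right:
                    keep[i] = True
                    changed = True
    return keep

# The two 250s next to 350 survive; the isolated 250 at the end does not
print(hysteresis_1d([100, 250, 350, 250, 100, 250]))
# [False, True, True, True, False, False]
```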

Main function for face/eye recognition and rectangle drawing

Our main function, main(), contains a while loop that holds pretty much the whole code, and it also calls the process_img function.

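def main():

    while True:
        # Grab a region of the screen; ImageGrab returns RGB data
        screen = np.array(ImageGrab.grab(bbox=(0,40,1200,1200)))
        # Edge map from the helper (not used by the face detector below)
        new_screen = process_img(screen)
        # The cascades work on a grayscale image
        gray = cv2.cvtColor(screen, cv2.COLOR_RGB2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.35, 5)

        for (x,y,w,h) in faces:
            cv2.rectangle(screen,(x,y),(x+w,y+h),(255,255,0),2)
            roi_gray = gray[y:y+h, x:x+w]
            roi_color = screen[y:y+h, x:x+w]

            eyes = eye_cascade.detectMultiScale(roi_gray)

            # Eye coordinates are relative to the face region
            for (ex,ey,ew,eh) in eyes:
                cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,127,255),2)

        # imshow expects BGR channel order, so convert before displaying
        cv2.imshow('img', cv2.cvtColor(screen, cv2.COLOR_RGB2BGR))

        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

main()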
Face and eye recognition in Python using open cv
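
The bbox tuple passed to ImageGrab.grab is (left, top, right, bottom) in screen pixels, and converting the grabbed RGB image with np.array gives an array of shape (height, width, 3). Here is a quick sketch for working out that shape ahead of time. The helper bbox_shape is hypothetical, and on scaled high-DPI displays the actual pixel counts may differ.

```python
def bbox_shape(bbox, channels=3):
    """Return the (height, width, channels) shape of a grabbed screen region."""
    left, top, right, bottom = bbox
    if right <= left or bottom <= top:
        raise ValueError("bbox must satisfy left < right and top < bottom")
    return (bottom - top, right - left, channels)

# The region used in main(): starts 40 pixels below the top of the screen
print(bbox_shape((0, 40, 1200, 1200)))  # (1160, 1200, 3)
```

A smaller region means fewer pixels for the cascade to scan on every pass, so shrinking bbox is one simple way to speed up the loop.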

Code breakdown

Now let’s break down the code above and analyze some critical lines for learning:

  1. Under the main function
    1. An infinite while loop grabs your screen continuously.
    2. The helper function process_img is called to convert the image to grayscale and run Canny edge detection (the resulting edge map is not used by the face detector in this code).
    3. face_cascade.detectMultiScale runs object detection based on the trained face classifier file.
    4. Its second argument, scaleFactor, controls how much the image is shrunk at each step as the detector scans it at several sizes. This matters because the trained classifier works with a fixed face size, so the image is repeatedly scaled down until larger faces fit it. For instance, 1.04 shrinks the image by only 4% per step, which means many steps and more computing resources. On the other hand, something like 1.5 shrinks the image by a factor of 1.5 per step, which is far more efficient but may cause some faces to be missed. This is definitely something to experiment with to find an optimal value (the code above uses 1.35).
    5. minNeighbors is another important parameter. It sets how many neighboring detections each candidate rectangle needs in order to be kept as a face. It usually takes a value between 3 and 6 (the code above uses 5).
    6. Rectangles are drawn.
    7. After that, another for loop detects the eyes within each face and draws boxes around them in the same way.
  2. Then the imshow() method shows the image with the face and eye boxes.
  3. The screen capture window is closed by pressing “q”.
  4. When “q” is pressed, all cv2 windows are destroyed to clean up. There is no webcam to release in this version.
  5. The main function is called in the last line as main().
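
Both the webcam and the screen capture versions mask the return value of cv2.waitKey with 0xFF before comparing it. waitKey returns -1 when no key was pressed within the timeout, and on some platforms the key code can carry extra high bits, so keeping only the lowest byte makes the comparison reliable. Here is a small sketch of that logic in plain Python; should_quit is a hypothetical helper that accepts either exit key.

```python
ESC_KEY = 27

def should_quit(raw_key, quit_keys=(ESC_KEY, ord("q"))):
    """Return True if the value returned by cv2.waitKey is a quit key."""
    key = raw_key & 0xFF  # keep only the lowest byte
    return key in quit_keys

print(should_quit(-1))        # False: -1 & 0xFF is 255, no key pressed
print(should_quit(27))        # True: ESC
print(should_quit(ord("q")))  # True
```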

Conclusion

  • That’s all it takes to write face recognition code in Python. Obviously it’s a high-level implementation, but that doesn’t mean it’s not programming. If you’d like to dig deeper into this topic, check out how face recognition technology works and how the xml files represent the training that makes this code work.
  • This is a twist on the webcam face recognition code that was published here. Using this version you can run your code on whatever is on your screen. This might enable further ideas such as analyzing faces in a movie, TV stream, or personal videos.
  • Another difference in this tutorial is that it’s slightly tidier thanks to the user-defined functions main and process_img, which makes it good practice for writing your own functions.

Optical Character Recognition (OCR)

ABSTRACT

  • In this tutorial we will take a closer look at the pytesseract module and discover some of its powerful features. You will be able to understand basic optical character recognition in a very simple form.

  • We will also use the PIL library for some image manipulation in Python, including opening images, displaying images and converting image types.

TUTORIAL

Let’s start with importing the libraries we’re going to need.

import PIL
from PIL import Image
import pytesseract
		

Here is some info about PIL

NAME
    PIL - Pillow (Fork of the Python Imaging Library)

DESCRIPTION
    Pillow is the friendly PIL fork by Alex Clark and Contributors.
        https://github.com/python-pillow/Pillow/

Here is some info about pytesseract

pytesseract is a very popular library for optical character recognition. It is a Python wrapper around the Tesseract OCR engine, which must be installed separately. Depending on your setup, you might need an extra line for pytesseract to find it. Just locate your Tesseract installation directory and point to the executable with the code below. Note that the path depends on your local setup, and you may or may not need to include the last part, for example:

r"C:\Users\USA\Anaconda3\Tesseract-OCR\tesseract" or r"C:\Users\USA\Anaconda3\Tesseract-OCR\tesseract\tesseract.exe"

Here is the code:

pytesseract.pytesseract.tesseract_cmd = r"C:\Users\USA\Anaconda3\Tesseract-OCR\tesseract\tesseract.exe"
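
If you are not sure where the executable lives, you can search for it instead of hard-coding the path. Below is a minimal sketch using only the standard library. The helper find_tesseract and the candidate path in the comment are assumptions for illustration, not part of pytesseract.

```python
import shutil
from pathlib import Path

def find_tesseract(candidates=()):
    """Return the first usable path to the tesseract executable, or None."""
    # First check whether tesseract is already on the system PATH
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit locations supplied by the caller
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None

# Intended use (assumes pytesseract is imported; the path is only an example):
# cmd = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
# if cmd:
#     pytesseract.pytesseract.tesseract_cmd = cmd
```

If the function returns None, Tesseract is most likely not installed yet, which is the usual cause of a TesseractNotFoundError.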
		
print(dir(pytesseract.pytesseract))
		
If we look at the contents of pytesseract.pytesseract, you can see a lot of different objects to explore. In this tutorial we will focus on image_to_string.

BytesIO
Image
LooseVersion
OSD_KEYS
Output
PandasNotSupported
QUOTE_NONE
RGB_MODE
TSVNotSupported
TesseractError
TesseractNotFoundError
__builtins__
__cached__
__doc__
__file__
__loader__
__name__
__package__
__spec__
cleanup
file_to_dict
find_loader
get_errors
get_pandas_output
get_tesseract_version
iglob
image_to_boxes
image_to_data
image_to_osd
image_to_pdf_or_hocr
image_to_string
is_valid
main
ndarray
normcase
normpath
numpy_installed
os
osd_to_dict
pandas_installed
pd
prepare
realpath
run_and_get_output
run_once
run_tesseract
save_image
shlex
string
subprocess
subprocess_args
sys
tempfile
tesseract_cmd
wraps

The help text for image_to_string is quite simple and straightforward.

help(pytesseract.pytesseract.image_to_string)
		

Help on function image_to_string in module pytesseract.pytesseract:

image_to_string(image, lang=None, config='', nice=0, output_type='string')
Returns the result of a Tesseract OCR run on the provided image to string

Now let’s open an image that contains some text and display it:

f = r'c:/Users/t/Desktop/default.png'
img = Image.open(f)
img.show()
		

ACTUAL OCR PART

We’ve opened an image with text. Let’s start doing some OCR!

text = pytesseract.image_to_string(img)
print(text)
		

Output:

Holy Python

PYTHON HOLLINESS

CONCLUSION

Yes, OCR is that simple, thanks to Python and pytesseract!

OCR’s scope is deeper than this quick tutorial but this tutorial can get you started!

  • One simple technique that can be used when OCR is not very successful is to convert the image to black and white using the PIL library. This usually improves pytesseract’s reading abilities.
  • You will discover that image modes such as “RGB”, “RGBA”, “RGBa”, “1” and “L” can dictate which methods you can and cannot use. Sometimes you might have to convert between modes using .convert(mode).
  • Also, text can blend into the rest of the image and be harder to extract for many reasons, so there are different methods and parameters to prepare the image for pytesseract, such as binarization (converting it to pure black and white).
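
The black and white conversion mentioned above can be sketched with PIL as follows. This is a minimal example, assuming a fixed threshold of 128, which you would tune for your own images. The binarize helper and the tiny synthetic image are made up for illustration.

```python
from PIL import Image

def binarize(img, threshold=128):
    """Convert an image to grayscale, then to pure black and white."""
    gray = img.convert("L")  # "L" is 8-bit grayscale
    # Map every pixel to 0 or 255, then store the result as a 1-bit image
    return gray.point(lambda p: 255 if p > threshold else 0).convert("1")

# Small synthetic image standing in for a scanned page:
# one dark pixel (text) on a light gray background
sample = Image.new("RGB", (4, 1), (200, 200, 200))
sample.putpixel((0, 0), (10, 10, 10))
bw = binarize(sample)
print(bw.mode)  # 1
```

The result can be passed to pytesseract.image_to_string in place of the original image.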

We hope this quick tutorial is eye-opening and motivates you to start exploring the incredible OCR possibilities with Python.