This is a fast eye detection and tracking program that takes its input from a webcam. The program uses OpenCV’s cascade detector to find the user’s face and eye, and template matching to track the eye afterwards.

If you look into the OpenCV samples directory, you’ll find facedetect.cpp, which detects the user’s face from the webcam using the Viola-Jones method. The program performs very well at detecting human faces, but it runs rather slowly because the detector is computationally expensive.

I want to take facedetect.cpp one step further and detect and track the user’s eye in real time. To achieve high speed, the program needs to perform the face and eye detection only once at startup, and again only if tracking is lost. After the eye is successfully detected, an eye template is created at runtime and used to track the eye with template matching. This greatly increases the speed of the real-time tracking.

The skeleton of the program

Let’s take a look at the skeleton of the program.

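#include <opencv2/opencv.hpp>

int  detectEye(cv::Mat& im, cv::Mat& tpl, cv::Rect& rect); // Detects the user's face and eye
void trackEye(cv::Mat& im, cv::Mat& tpl, cv::Rect& rect);  // Tracks the user's eye given its template

int main()
{
    cv::VideoCapture cap(0);  // Open the default webcam

    cv::Mat frame;
    cv::Mat eye_tpl;  // The eye template
    cv::Rect eye_bb;  // The eye bounding box

    while (cv::waitKey(15) != 'q')
    {
        cap >> frame;
        if (frame.empty())
            break;  // No frame available (camera missing or disconnected)

        // Convert to grayscale for the detector and the template matcher
        cv::Mat gray;
        cv::cvtColor(frame, gray, CV_BGR2GRAY);

        if (eye_bb.width == 0 && eye_bb.height == 0)
        {
            // Detection stage: no eye is known yet
            detectEye(gray, eye_tpl, eye_bb);
        }
        else
        {
            // Tracking stage: follow the eye with template matching
            trackEye(gray, eye_tpl, eye_bb);
            cv::rectangle(frame, eye_bb, CV_RGB(0,255,0));
        }
        cv::imshow("video", frame);
    }
    return 0;
}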

In the main() function, the program opens a video stream from the webcam. Each frame is converted to grayscale to reduce processing time. If the bounding box of the eye is still empty, the program calls the detectEye() function, which detects the user’s face and eye. If the eye is successfully located, the function returns an eye template and its bounding box.

Once the bounding box is set, the program calls the trackEye() function instead. This function takes the current frame, the eye template, and the eye bounding box as input. The eye template is used to locate the eye in the given frame with template matching. If the eye is successfully located, the bounding box is updated to the new location of the eye.

If the eye tracking is lost, the eye bounding box is cleared, so the program calls the detectEye() function again on the next frame.
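The detect-or-track decision described above can be illustrated without OpenCV. The following is only a sketch: Box, Stage, and runLoop are hypothetical names that are not part of the program, and the two callbacks merely stand in for detectEye() and trackEye(). It records which stage handles each frame, so the switch back to detection after a lost track is visible.

```cpp
#include <functional>
#include <string>

// Minimal stand-in for cv::Rect; an empty box means "no eye located yet".
struct Box {
    int x, y, width, height;
    bool empty() const { return width == 0 && height == 0; }
};

// Hypothetical callbacks standing in for detectEye() and trackEye().
// Each receives the frame index and the bounding box it may update.
using Stage = std::function<void(int frame, Box& box)>;

// Runs the detect-or-track decision for `frames` frames and records which
// stage handled each frame ('D' = detection, 'T' = tracking).
std::string runLoop(int frames, const Stage& detect, const Stage& track) {
    Box eye{0, 0, 0, 0};
    std::string log;
    for (int f = 0; f < frames; ++f) {
        if (eye.empty()) {
            detect(f, eye);  // slow path: only while no eye is known
            log += 'D';
        } else {
            track(f, eye);   // fast path: may clear the box if the eye is lost
            log += 'T';
        }
    }
    return log;
}
```

With a detector that first succeeds on frame 1 and a tracker that loses the eye on frame 3, the log shows two detection frames, two tracking frames, one re-detection, and tracking again.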

Detecting user’s face and eye with Viola-Jones method

The face and eye detection is implemented in the detectEye() function. Given an image, the function tries to detect a human face in it. If that succeeds, it goes on to detect the eye inside the face region. If that also succeeds, it creates and returns the eye template and its bounding box.

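/**
 * Function to detect human face and the eyes from an image.
 *
 * @param  im    The source image
 * @param  tpl   Will be filled with the eye template, if detection succeeds
 * @param  rect  Will be filled with the bounding box of the eye
 * @return zero=failed, nonzero=success
 */
int detectEye(cv::Mat& im, cv::Mat& tpl, cv::Rect& rect)
{
    // face_cascade and eye_cascade are classifier objects declared outside
    // this function; they must be loaded before it is called.
    std::vector<cv::Rect> faces, eyes;
    face_cascade.detectMultiScale(im, faces, 1.1, 2,
                                  CV_HAAR_SCALE_IMAGE, cv::Size(30,30));

    for (size_t i = 0; i < faces.size(); i++)
    {
        // Search for eyes only inside the detected face region
        cv::Mat face = im(faces[i]);
        eye_cascade.detectMultiScale(face, eyes, 1.1, 2,
                                     CV_HAAR_SCALE_IMAGE, cv::Size(20,20));
        if (eyes.size())
        {
            // Eye coordinates are relative to the face; shift them to image coordinates
            rect = eyes[0] + cv::Point(faces[i].x, faces[i].y);
            tpl  = im(rect);
            break;  // Stop here so the return value matches the rect that was set
        }
    }

    return static_cast<int>(eyes.size());
}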
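One detail in detectEye() is easy to miss: the eye detector runs on the face sub-image, so the rectangle it reports is relative to the face's top-left corner and has to be shifted back into full-image coordinates. A minimal sketch of that shift, using a hypothetical Box struct and toImageCoords helper in place of cv::Rect arithmetic:

```cpp
// Minimal stand-in for cv::Rect (hypothetical, for illustration only).
struct Box { int x, y, width, height; };

// Converts a box reported relative to a sub-image (here: the face region)
// into coordinates of the full image by adding the sub-image's origin.
// The size of the box is unchanged.
Box toImageCoords(const Box& inner, const Box& outer) {
    return Box{inner.x + outer.x, inner.y + outer.y, inner.width, inner.height};
}
```

For example, an eye found at (40, 60) inside a face whose top-left corner is at (100, 50) lies at (140, 110) in the frame.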

Tracking user’s eye with template matching

Once the eye template and its bounding box are set, this function locates the eye in the given frame with template matching. The matching is performed only inside a search window around the previous eye position, which keeps it fast. If it succeeds, the bounding box is updated to the new location of the eye; otherwise the bounding box is cleared.

/**
 * Perform template matching to search the user's eye in the given image.
 *
 * @param   im    The source image
 * @param   tpl   The eye template
 * @param   rect  The eye bounding box, will be updated with
 *                the new location of the eye
 */
void trackEye(cv::Mat& im, cv::Mat& tpl, cv::Rect& rect)
{
    // Search window: three times the size of the bounding box, centred on it
    cv::Size size(rect.width * 2, rect.height * 2);
    cv::Rect window(rect + size - cv::Point(size.width/2, size.height/2));

    // Clip the window to the image boundaries
    window &= cv::Rect(0, 0, im.cols, im.rows);

    // Result matrix; note that cv::Mat takes (rows, cols, type)
    cv::Mat dst(window.height - tpl.rows + 1, window.width - tpl.cols + 1, CV_32FC1);
    cv::matchTemplate(im(window), tpl, dst, CV_TM_SQDIFF_NORMED);

    double minval, maxval;
    cv::Point minloc, maxloc;
    cv::minMaxLoc(dst, &minval, &maxval, &minloc, &maxloc);

    // With CV_TM_SQDIFF_NORMED a lower score is a better match
    if (minval <= 0.2)
    {
        rect.x = window.x + minloc.x;
        rect.y = window.y + minloc.y;
    }
    else
    {
        // Tracking lost: clear the bounding box to trigger re-detection
        rect.x = rect.y = rect.width = rect.height = 0;
    }
}
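To make the tracking step more concrete, here is a rough plain C++ sketch of the same idea without OpenCV. It is not the program's code: Box, Gray, searchWindow, sqdiffNormed, and track are hypothetical names, and the score is the normalised squared difference, sum((T-I)^2) / sqrt(sum(T^2) * sum(I^2)), which is what CV_TM_SQDIFF_NORMED computes. Treating a zero denominator as "no match" is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Box { int x, y, width, height; };

// Row-major 8-bit grayscale image (hypothetical stand-in for cv::Mat).
struct Gray {
    int cols, rows;
    std::vector<unsigned char> px;
    int at(int x, int y) const { return px[static_cast<std::size_t>(y) * cols + x]; }
};

// Window three times the size of `r`, centred on it, clipped to the image.
Box searchWindow(const Box& r, int cols, int rows) {
    int x0 = std::max(r.x - r.width, 0);
    int y0 = std::max(r.y - r.height, 0);
    int x1 = std::min(r.x + 2 * r.width, cols);
    int y1 = std::min(r.y + 2 * r.height, rows);
    return Box{x0, y0, std::max(x1 - x0, 0), std::max(y1 - y0, 0)};
}

// Normalised squared difference between `tpl` and the patch of `im` whose
// top-left corner is (ox, oy). 0 means identical; larger means worse.
double sqdiffNormed(const Gray& im, const Gray& tpl, int ox, int oy) {
    double diff = 0, tt = 0, ii = 0;
    for (int y = 0; y < tpl.rows; ++y)
        for (int x = 0; x < tpl.cols; ++x) {
            double t = tpl.at(x, y), i = im.at(ox + x, oy + y);
            diff += (t - i) * (t - i);
            tt += t * t;
            ii += i * i;
        }
    double denom = std::sqrt(tt * ii);
    // Assumption of this sketch: an all-black patch or template never matches.
    return denom > 0 ? diff / denom : std::numeric_limits<double>::max();
}

// Slides the template over the search window. Returns true and moves `rect`
// if the best score is at or below `threshold`; otherwise clears `rect`.
bool track(const Gray& im, const Gray& tpl, Box& rect, double threshold = 0.2) {
    Box win = searchWindow(rect, im.cols, im.rows);
    double best = std::numeric_limits<double>::max();
    int bx = 0, by = 0;
    for (int y = win.y; y + tpl.rows <= win.y + win.height; ++y)
        for (int x = win.x; x + tpl.cols <= win.x + win.width; ++x) {
            double s = sqdiffNormed(im, tpl, x, y);
            if (s < best) { best = s; bx = x; by = y; }
        }
    if (best <= threshold) { rect.x = bx; rect.y = by; return true; }
    rect = Box{0, 0, 0, 0};
    return false;
}
```

Restricting the scan to the window is what keeps the per-frame cost low: the number of candidate positions depends on the size of the eye box, not on the size of the frame.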

The result

When the program starts, the video doesn’t play smoothly because every frame goes through the Viola-Jones detector to find the face and the eye. Once the eye is successfully detected, the program switches to template matching and the video plays smoothly.


The complete, fully working code is available on GitHub.