Webcam Eye Tracker: Eye Tracking Video Games

Now that we have a working predictive model, we can deploy it to a simple application to test how well the eye tracker works. The plan is three-fold: 1) create a Predictor class that can load a trained model and make predictions, 2) add a "tracking" mode to the data collector as a quick way to test the Predictor, and 3) create a simple screen recorder that can save videos of eye tracking while playing video games. Predictor class: We want to create a predictor class that can handle all of the model loading and predicting. PyTorch models can be saved either as checkpoint files or…


Webcam Eye Tracker: Deep Learning with PyTorch

So far we have extracted webcam features and collected coordinate data. Now we can use that dataset to create our deep learning model with PyTorch. The following models and analyses were conducted in a Jupyter notebook, which can be found here. The problem we have is basically bounding box regression, but simplified to only 2 continuous output values (X-Y screen coordinate). To summarize, the data we have available to us: Possible inputs: unaligned face (3D image), aligned face (3D image), left eye (3D image), right eye (3D image), head position (2D image), head angle (scalar). Outputs: X screen coordinate, Y screen coordinate. The goal is to find the most…


Webcam Eye Tracker: Webcam Features and Face Detection

Now that we have a general overview of the project, the first step in creating our eye tracker is getting video from the webcam. Following that, we need to perform face detection, alignment, and calculate various features from that video stream. Webcam video: We'll start by creating a Detector that uses OpenCV to retrieve frames from the webcam. By itself, reading from the webcam is quite straightforward:

    import cv2
    capture = cv2.VideoCapture(0)
    capture.read()

However, reading webcam frames is a blocking action and can cause quite a lot of slowdown in our application, so ideally we would do that retrieval in…
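One common way to keep a blocking read out of the main loop is to poll the camera from a background thread and only ever hand the newest frame to the caller. The sketch below assumes that approach; the excerpt is cut off before it says what the post actually does. The class names `ThreadedFrameReader` and `FakeCapture` are hypothetical. The reader only needs an object with a `read()` method that returns an `(ok, frame)` pair, which is what `cv2.VideoCapture.read()` returns, so a stand-in capture is used here to keep the example runnable without a webcam.

```python
import threading
import time


class ThreadedFrameReader:
    """Polls capture.read() in a background thread and keeps the latest frame."""

    def __init__(self, capture):
        self._capture = capture          # any object with read() -> (ok, frame)
        self._lock = threading.Lock()    # guards access to the stored frame
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._update, daemon=True)
        self._thread.start()
        return self

    def _update(self):
        # The blocking read happens here, away from the main thread.
        while self._running:
            ok, frame = self._capture.read()
            if ok:
                with self._lock:
                    self._frame = frame

    def latest(self):
        """Return the newest frame, or None if nothing has been read yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join()


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture: read() blocks briefly."""

    def __init__(self):
        self.count = 0

    def read(self):
        time.sleep(0.01)                 # simulate a slow, blocking camera read
        self.count += 1
        return True, self.count          # the "frame" is just a counter here
```

With a real camera, `ThreadedFrameReader(cv2.VideoCapture(0)).start()` should work the same way, and the main loop can call `latest()` without waiting on the device.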


Webcam Eye Tracker: An End-to-end Deep Learning Project

Recently, I wanted to learn PyTorch and needed to find a project to help focus my learning. I have always been interested in the idea of creating a webcam eye tracker, so that seemed like a good project for this. Eye trackers typically rely on infrared for accurate tracking, but performing the same task using purely vision techniques seemed like an interesting challenge. What follows is a series of posts on the process of creating a webcam eye tracker from scratch. As always, we should start by clarifying the main problems we're trying to address by going through this process.…


Colour image classification (CIFAR-10) using a CNN

As I mentioned in a previous post, a convolutional neural network (CNN) can be used to classify colour images in much the same way as grey scale classification. The way to achieve this is by utilizing the depth dimension of our input tensors and kernels. In this example I'll be using the CIFAR-10 dataset, which consists of 32x32 colour images belonging to 10 different classes. You can see a few examples of each class in the following image from the CIFAR-10 website: Although previously I've talked about the Lasagne and nolearn packages (here and here), extending those to colour images…
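To illustrate what "utilizing the depth dimension" means, here is a minimal plain-Python sketch of a single kernel applied to a colour image. It is not the Lasagne or nolearn code from the post, and the function name `conv2d_multichannel` is hypothetical. The point is that the kernel has one slice per colour channel, and the products from all channels are summed into a single output value per position.

```python
def conv2d_multichannel(image, kernel):
    """Valid cross-correlation of a C x H x W image with a C x kH x kW kernel.

    Both arguments are nested lists. The result is a single 2D feature map,
    because the contributions of all channels are summed together.
    """
    channels = len(image)
    if len(kernel) != channels:
        raise ValueError("kernel depth must match the number of image channels")
    height, width = len(image[0]), len(image[0][0])
    k_h, k_w = len(kernel[0]), len(kernel[0][0])

    output = []
    for i in range(height - k_h + 1):
        row = []
        for j in range(width - k_w + 1):
            total = 0.0
            for c in range(channels):            # sum across the depth dimension
                for di in range(k_h):
                    for dj in range(k_w):
                        total += image[c][i + di][j + dj] * kernel[c][di][dj]
            row.append(total)
        output.append(row)
    return output


# A CIFAR-10 sized input: 3 colour channels of 32x32 pixels, all set to 1.
image = [[[1.0] * 32 for _ in range(32)] for _ in range(3)]
# One 3x3 kernel with depth 3, all weights set to 1.
kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]

feature_map = conv2d_multichannel(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 30 30
print(feature_map[0][0])                      # 27.0
```

Each output value is the sum of 3 x 3 x 3 = 27 products, which is why the all-ones example gives 27.0. A real convolutional layer would apply many such kernels, giving one feature map per kernel.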


Visualizing Convolutional Neural Networks using nolearn

We previously talked about Convolutional Neural Networks (CNN) and how to use them to recognize handwritten digits using Lasagne. While we can manually extract kernel parameters to visualize weights and activation maps (as discussed in the previous post), the nolearn package offers an easy way to visualize different elements of CNNs. nolearn is a wrapper around Lasagne (which itself is a wrapper around Theano), and offers some nice visualization options such as plotting occlusion maps to help diagnose the performance of a CNN model. Additionally, nolearn offers a very high level API that makes model training even simpler than with Lasagne. In…


Handwritten digit recognition with a CNN using Lasagne

Following my overview of Convolutional Neural Networks (CNN) in a previous post, now let's build a CNN model to 1) classify images of handwritten digits, and 2) see what is learned by this type of model. Handwritten digit recognition is the 'Hello World' example of the CNN world. I'll be using the MNIST database of handwritten digits, which you can find here. The MNIST database contains grey scale images of size 28x28 (pixels), each containing a handwritten number from 0-9 (inclusive). The goal: given a single image, how do we build a model that can accurately recognize the number that…


Overview of Convolutional Neural Networks (CNN)

Regular feed-forward artificial neural networks (ANN), like the type featured below, allow us to learn higher order non-linear features, which typically results in improved prediction accuracy over smaller models like logistic regression. However, artificial neural networks have a number of problems that make them less ideal for certain types of problems. For example, imagine a case where we wanted to classify images of handwritten digits. An image is just a 2D array of pixel intensity values, so a small 28x28 pixel image has a total of 784 pixels. If we wanted to classify this using an ANN, we would flatten…
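To make the pixel arithmetic concrete, here is a minimal sketch that flattens a 28x28 image into a single vector and counts the parameters of one fully connected layer on top of it. The helper names and the hidden layer size of 100 units are hypothetical choices for illustration and do not come from the post.

```python
def flatten(image):
    """Turn a 2D list of pixel intensities into one flat list, row by row."""
    return [pixel for row in image for pixel in row]


def dense_parameter_count(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    return n_inputs * n_units + n_units


# A blank 28x28 grey scale image (all pixel intensities set to 0).
image = [[0] * 28 for _ in range(28)]

vector = flatten(image)
print(len(vector))                              # 784

# Hypothetical hidden layer with 100 units fed by the flattened image.
print(dense_parameter_count(len(vector), 100))  # 78500
```

Even this small image and modest layer need 78,500 parameters, and the count grows in proportion to the number of pixels.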
