Research at RIT: Improving Eye Tracking for Better Gaze Estimation
In simple terms, eye tracking is the process of measuring eye movements to find out where someone is looking. Predicting where someone is looking at any given moment is known as gaze estimation, and if we can do it reliably, we can run interesting experiments into how people think. That's one of the reasons eye tracking has been used so extensively in consumer studies and marketing.
However, the process of eye tracking can get tedious. Since a small angular difference in the eye's coordinate space subtends a large distance in world space, accurate and precise eye tracking is essential for any meaningful insight, and real-time applications make the problem harder still. One of the best analytical methods for eye tracking is detailed in the 2013 paper by Lech Swirski and Neil Dodgson at the University of Cambridge, titled "A fully-automatic, temporal approach to single camera, glint-free 3D eye model fitting." However, this method can be inaccurate at times and may not deliver real-time performance.
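To see why precision matters so much, here is a quick back-of-the-envelope calculation (a minimal Python sketch; the 1-degree error and the viewing distances are illustrative assumptions, not measured values):

```python
import math

def gaze_error_on_surface(angular_error_deg: float, viewing_distance_cm: float) -> float:
    """On-surface error (cm) produced by an angular gaze error at a
    given viewing distance, using the geometry d = D * tan(theta)."""
    return viewing_distance_cm * math.tan(math.radians(angular_error_deg))

# An angular error of just 1 degree grows quickly with distance:
print(f"{gaze_error_on_surface(1.0, 60.0):.1f} cm")   # ~1.0 cm on a monitor 60 cm away
print(f"{gaze_error_on_surface(1.0, 300.0):.1f} cm")  # ~5.2 cm on a display 3 m away
```

At 3 m, a single degree of angular error already displaces the estimated gaze point by several centimeters, easily the width of a whole object in the scene.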
To combat this, RIT-Eyes and RITnet were developed. RIT-Eyes is a rendering pipeline that takes in eye-tracking data and renders near-eye images that can then be used for eye-tracking applications. RITnet is a network for real-time semantic segmentation of the eye for gaze tracking. The idea is to generate a dataset using the RIT-Eyes pipeline and use that dataset to train the RITnet network.
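To make that pipeline concrete, here is a minimal sketch of what one training step on rendered data looks like, assuming a PyTorch setup. The tiny convolutional network below is a stand-in, not the actual RITnet architecture, and the random tensors stand in for RIT-Eyes renderings and their per-pixel labels; the four classes follow the usual eye-segmentation split (background, sclera, iris, pupil).

```python
import torch
import torch.nn as nn

NUM_CLASSES = 4  # background, sclera, iris, pupil

# Toy fully-convolutional segmentation network (stand-in for RITnet,
# which is a real encoder-decoder model).
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, NUM_CLASSES, kernel_size=1),  # per-pixel class scores
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Placeholder batch standing in for rendered grayscale near-eye images
# and the per-pixel class labels a renderer can export alongside them.
images = torch.rand(8, 1, 96, 160)                    # B x 1 x H x W
labels = torch.randint(0, NUM_CLASSES, (8, 96, 160))  # B x H x W

logits = model(images)            # B x NUM_CLASSES x H x W
loss = criterion(logits, labels)  # per-pixel cross-entropy
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.3f}")
```

The appeal of synthetic data here is that the renderer knows the exact geometry of every frame, so the per-pixel labels come for free instead of requiring hand annotation.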
The initial results were impressive; however, there was still plenty of room for improvement. As an RA, I'm working on improving the RIT-Eyes pipeline and using the results to produce better renderings for training. Some of the newer features we are working on:
- Support for a binocular system: Initially, the RIT-Eyes pipeline only supported a monocular system, i.e. one eye. In the newer version, we are developing a binocular system, i.e. two eyes.
- Aesthetic improvements: The initial paper had a limited number of 3D models and textures to work with. In the newer version, we are working on a "universal" head model that can encompass different head and eye structures.