Handbook of Face Recognition

Stan Z. Li and Anil K. Jain. 2011. Chapter 1: Introduction. In Handbook of Face Recognition (2nd. ed.). Springer Publishing Company, Incorporated.

Chapter 1: Introduction

This book is aimed at students and practitioners of face recognition and offers advanced tutorials, surveys of methods, and a guide to current (2011) technology. Chapter 1 covers the components of facial analysis, such as detection and tracking, and the major technical challenges in building a system. Face recognition is largely motivated by practical applications, particularly biometrics, security, authentication, and multimedia management. The authors describe the advantages of face recognition over other biometric technologies as "natural, nonintrusive, and easy to use."

Face recognition operates in two modes: (1) face verification (authentication) and (2) face identification (recognition). Verification is a one-to-one match: the query face is compared against the enrolled face image of the identity the person claims. Identification is a one-to-many match: the query face is compared against all faces in a database to determine whose face it is. Matching may also be one-to-few, as when a face is compared against a short, pre-selected list of candidates (e.g., suspects). Facial analysis systems perform best under constrained conditions and are much less accurate under unconstrained conditions with varied lighting, angles, expressions, occlusion, etc.
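The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not the chapter's method: the feature vectors, the cosine-similarity score, the 0.8 threshold, and the names `verify`, `identify`, and `gallery` are all assumptions made for the example.

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity score in [-1, 1] between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe, claimed_template, threshold=0.8):
    """One-to-one: accept or reject a claimed identity."""
    return cosine_similarity(probe, claimed_template) >= threshold

def identify(probe, gallery, threshold=0.8):
    """One-to-many: return the best-matching identity, or None if no match is good enough."""
    best_name, best_score = None, -1.0
    for name, template in gallery.items():
        score = cosine_similarity(probe, template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Hypothetical enrolled templates (toy 3-dimensional features).
gallery = {
    "alice": np.array([1.0, 0.0, 0.0]),
    "bob": np.array([0.0, 1.0, 0.0]),
}
probe = np.array([0.9, 0.1, 0.0])

claim_accepted = verify(probe, gallery["alice"])   # one-to-one check
who = identify(probe, gallery)                     # one-to-many search
```

A one-to-few search would, under the same assumptions, be `identify` called on a gallery restricted to the short list.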

How it Works

A common pipeline for face recognition algorithms is as follows: image/video input > face detection/tracking > face alignment > feature extraction on the aligned face > feature matching against a database of faces > face ID (or lack thereof).

Face alignment: "is aimed at achieving more accurate localization and at normalizing faces ... Facial components, such as eyes, nose, and mouth and facial outline, are located; based on the location points, the input face image is normalized with respect to geometrical properties, such as size and pose, using geometrical transforms or morphing. The face is usually further normalized with respect to photometrical properties such illumination and gray scale."
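As a rough sketch of what geometric and photometric normalization can look like, the Python below computes a similarity transform (rotation, uniform scale, translation) that moves two detected eye centers to fixed target positions, and rescales pixel intensities to zero mean and unit variance. The target coordinates, the function names, and the choice of eye centers as the only landmarks are assumptions for illustration; the chapter does not prescribe a specific procedure.

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye,
                            target_left=(30.0, 40.0), target_right=(70.0, 40.0)):
    """Return the 2x3 similarity transform mapping the eye centers to target positions.

    Points are treated as complex numbers z = x + iy, so the transform is z -> a*z + b.
    """
    s1, s2 = complex(*left_eye), complex(*right_eye)
    d1, d2 = complex(*target_left), complex(*target_right)
    a = (d2 - d1) / (s2 - s1)   # rotation and scale
    b = d1 - a * s1             # translation
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag,  a.real, b.imag]])

def apply_transform(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return matrix @ np.array([x, y, 1.0])

def normalize_intensity(image, eps=1e-8):
    """Photometric normalization: zero mean, (approximately) unit variance."""
    image = image.astype(float)
    return (image - image.mean()) / (image.std() + eps)

# Hypothetical eye locations from a landmark detector.
M = eye_alignment_transform((120.0, 95.0), (180.0, 110.0))
aligned_left = apply_transform(M, (120.0, 95.0))
aligned_right = apply_transform(M, (180.0, 110.0))
```

In practice the same matrix would be used to warp the whole image, for example with an image library's affine-warp routine.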

Feature extraction: "is performed to provide effective information that is useful for distinguishing between faces of different persons and stable with respect to the geometrical and photometrical variations."

Recognition results depend greatly on the features extracted from the face image and on the classification methods used to distinguish faces. Face localization and normalization are the basis for extracting those features.

Face Subspaces

Subspace analysis techniques are based on the notion that the class of patterns of interest (e.g., faces) resides in a subspace of the overall image space. Even a small image has a large number of pixels, which can express many pattern classes (e.g., trees, houses, faces). Among all these possible configurations, only a few correspond to a face pattern. "Therefore, the original image representation is highly redundant, and the dimensionality of this representation could be greatly reduced when only the face pattern are of interest."

Eigenfaces/PCA: A small number (40 or fewer) of eigenfaces are derived from face training data by using PCA or the Karhunen-Loeve transform. A single face image is then represented as a low-dimensional feature vector of weights, one per eigenface. "The features in such subspace provide more salient and richer information for recognition than the raw image. The use of subspace modeling techniques has significantly advanced face recognition technology." The face "manifold" accounts for the distribution of faces in the image space, while the nonface manifold represents everything else.
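A minimal sketch of the eigenface idea, assuming NumPy and using random arrays as stand-ins for real, aligned face images: PCA is computed via the SVD of the mean-centered data, and each image is then reduced to a short weight vector. The function names and the choice of 8 components are illustrative assumptions.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """Compute the mean face and the top principal directions (eigenfaces)."""
    X = images.reshape(len(images), -1).astype(float)   # one flattened image per row
    mean_face = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean_face, full_matrices=False)
    return mean_face, vt[:n_components]

def project(image, mean_face, eigenfaces):
    """Represent one image as a low-dimensional weight vector."""
    return eigenfaces @ (image.ravel().astype(float) - mean_face)

def reconstruct(weights, mean_face, eigenfaces):
    """Approximate the flattened image from its weights."""
    return mean_face + weights @ eigenfaces

rng = np.random.default_rng(0)
faces = rng.random((20, 16, 16))   # synthetic stand-ins for 20 aligned 16x16 faces
mean_face, eigenfaces = fit_eigenfaces(faces, n_components=8)
weights = project(faces[0], mean_face, eigenfaces)      # 256 pixels -> 8 weights
approx = reconstruct(weights, mean_face, eigenfaces)
```

Matching can then be done by comparing weight vectors rather than raw pixels.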

Technical Challenges

Large Variability in Facial Appearance: Sources of variation include viewing angle, illumination, and expression, as well as imaging parameters such as aperture, exposure time, and lens aberrations.

Highly Complex Nonlinear Manifolds: "In a linear subspace, Euclidean distance and more generally Mahalanobis distance, which are normally used for template matching, do not perform well for classifying between face and nonface manifolds and between manifolds of individuals ... This crucial fact limits the power of the linear methods to achieve highly accurate face detection and recognition."
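For reference, the two distances named in the quote can be written out as follows. This is only a sketch of their standard definitions on toy 2-dimensional data (the vectors and covariance matrix are made up); it does not demonstrate the nonlinearity problem itself.

```python
import numpy as np

def euclidean(x, y):
    """Straight-line distance between two feature vectors."""
    return float(np.linalg.norm(x - y))

def mahalanobis(x, mean, cov):
    """Distance from x to a class mean, scaled by the class covariance."""
    diff = x - mean
    # Solve cov @ v = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

mean = np.zeros(2)
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])   # the class varies more along the first axis
x = np.array([2.0, 0.0])

d_euc = euclidean(x, mean)          # 2.0
d_mah = mahalanobis(x, mean, cov)   # 1.0: the offset lies along the high-variance axis
```

Both distances assume a single linear (ellipsoidal) description of each class, which is the limitation the authors point to.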

High Dimensionality and Small Sample Size: "the number of examples per person (typically fewer than 10, even just one) available for learning the manifold is usually much smaller than the dimensionality of the image space; a system trained on so few examples may not generalize well to unseen instances of the face."
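One way to see the problem numerically: with n training images, the mean-centered data matrix has rank at most n - 1, so the sample covariance carries no information about most directions in pixel space. The sketch below uses random data as a stand-in for one person's images; the sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_pixels = 10, 32 * 32           # 10 images of 32x32 pixels
X = rng.random((n_samples, n_pixels))       # synthetic stand-ins for face images

centered = X - X.mean(axis=0)
rank = np.linalg.matrix_rank(centered)      # at most n_samples - 1

# Fraction of image-space directions for which the samples give a variance estimate.
covered = rank / n_pixels
```

Under these assumptions fewer than 1% of the 1024 dimensions are covered, which is why dimensionality reduction or regularization is usually needed before learning from so few examples.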

Technical Solutions

The authors describe two strategies for addressing the challenges above:
(1) feature extraction for constructing a "good" feature space where face manifolds become simpler, through two steps of processing (first normalize faces geometrically and photometrically, then "extract features in the normalized images which are stable with respect to such variations, such as based on Gabor wavelets");
(2) "construct classification engines able to solve difficult nonlinear classification and regression problems in the feature space and to generalize better."
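To make the Gabor wavelet reference in item (1) more concrete, here is a minimal sketch of a real-valued Gabor kernel and a tiny filter bank applied to one image patch. The parameter values, the four orientations, and the function names are assumptions for illustration; the chapter does not specify a particular configuration.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a Gabor filter: a Gaussian envelope times a cosine carrier.

    size is assumed to be odd so the kernel has a single center pixel.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier oscillates along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + (gamma * y_r)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

def gabor_features(patch, orientations=4, wavelength=4.0, sigma=2.0):
    """Response of a square, odd-sized patch to kernels at several orientations."""
    size = patch.shape[0]
    thetas = np.arange(orientations) * np.pi / orientations
    return np.array([np.sum(patch * gabor_kernel(size, wavelength, t, sigma))
                     for t in thetas])

rng = np.random.default_rng(0)
patch = rng.random((9, 9))          # synthetic stand-in for a normalized face patch
features = gabor_features(patch)    # one response per orientation
```

A full system would typically apply such a bank at several scales and at many facial locations, then pass the resulting feature vector to the classifier described in item (2).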