Face Recognition Across Age in Harry Potter Movies
My Master's thesis on how Harry Potter movies can help us measure AI's performance across ages
This is an example of my early academic work. In this project, I studied the effect of aging on face recognition using Harry Potter movies, as part of my Master of Science degree in Computer Vision and Machine Learning at Istanbul Technical University (ITU). My work in this research direction focused on contributing to the state-of-the-art in face recognition when data is captured in the wild under challenging conditions. The work was carried out under the supervision of Makarand Tapaswi and Prof. Dr. Hazim Kemal Ekenel and published at the ACM International Conference on Multimedia Retrieval:
In this paper, we present a novel approach towards multi-modal emotion recognition on the challenging AFEW'16 dataset, composed of video clips labeled with the six basic emotions plus the neutral state. After a preprocessing stage, we employ different feature extraction techniques (CNN, DSIFT on the face and facial ROI, geometric, and audio based) and encode frame-based features using Fisher vector representations. Next, we leverage the properties of each modality using different fusion schemes. Apart from early-level fusion and decision-level fusion approaches, we propose a hierarchical decision-level method based on information gain principles, and we optimize its parameters using genetic algorithms. The experimental results demonstrate the suitability of our method, as we obtain 53.06% validation accuracy, surpassing the 38.81% baseline by more than 14 percentage points on a challenging dataset suitable for emotion recognition in the wild.
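Below is a minimal sketch of the decision-level (late) fusion idea mentioned in the abstract: each modality produces class probabilities for a clip, and the fused prediction is a weighted average of those probabilities. The modality names, scores, and weights are illustrative only; the paper's fusion is hierarchical and its parameters are optimized with a genetic algorithm, which is not reproduced here.

```python
# Minimal sketch of decision-level (late) fusion across modalities.
# Modality names, probabilities, and weights below are hypothetical.
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def late_fusion(prob_per_modality, weights):
    """Weighted average of per-modality class probabilities for one clip."""
    fused = np.zeros(len(EMOTIONS))
    total = 0.0
    for name, probs in prob_per_modality.items():
        w = weights.get(name, 0.0)
        fused += w * np.asarray(probs)
        total += w
    fused /= total
    return EMOTIONS[int(np.argmax(fused))]

# Hypothetical per-modality probabilities for a single video clip.
scores = {
    "cnn_face": [0.10, 0.05, 0.05, 0.55, 0.10, 0.05, 0.10],
    "dsift_fv": [0.15, 0.05, 0.10, 0.40, 0.15, 0.05, 0.10],
    "audio":    [0.20, 0.05, 0.10, 0.30, 0.20, 0.05, 0.10],
}
weights = {"cnn_face": 0.5, "dsift_fv": 0.3, "audio": 0.2}
print(late_fusion(scores, weights))  # -> happy
```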
A Face Recognition Based Multiplayer Mobile Game Application
Ugur Demir, Esam Ghaleb, and Hazım Kemal Ekenel
In IFIP International Conference on Artificial Intelligence Applications and Innovations, March 2014
In this paper, we present a multiplayer mobile game application that aims at enabling individuals to play paintball or laser-tag style games using their smartphones. In the application, face detection and recognition technologies are utilised to detect and identify the individuals, respectively. In the game, first, one of the players starts the game and invites the others to join. Once everyone joins the game, they receive a notification for the training stage, at which they need to record another player's face for a short time. After the completion of the training stage, the players can start shooting each other, that is, direct the smartphone towards another user and, when the face is visible, press the shoot button on the screen. Both the shooter and the one who is shot are notified by the system after a successful hit. To realise this game in real time, fast and robust face detection and face recognition algorithms have been employed. The face recognition performance of the system is benchmarked on face data collected from the game when it is played with up to ten players. It is found that the system is able to identify the players with a success rate of around or over 90%, depending on the number of players in the game.
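The sketch below outlines the enrol-then-shoot loop described in the abstract. It uses OpenCV's Haar-cascade detector and LBPH recognizer purely as stand-in components; the paper does not state which specific detection and recognition algorithms were used, and the mobile client/server logic is omitted. It assumes opencv-contrib-python for the cv2.face module.

```python
# Sketch of the training stage and the "shoot" action, with OpenCV
# components standing in for the unspecified detection/recognition methods.
import cv2
import numpy as np

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
recognizer = cv2.face.LBPHFaceRecognizer_create()

def detect_face(gray):
    """Return the largest detected face region (resized), or None."""
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    return cv2.resize(gray[y:y + h, x:x + w], (100, 100))

def train_players(samples):
    """samples: {player_id (int): [grayscale frames recorded in training]}."""
    faces, labels = [], []
    for player_id, frames in samples.items():
        for frame in frames:
            face = detect_face(frame)
            if face is not None:
                faces.append(face)
                labels.append(player_id)
    recognizer.train(faces, np.array(labels))

def shoot(frame_gray, threshold=70.0):
    """Identify the player in view when the shoot button is pressed."""
    face = detect_face(frame_gray)
    if face is None:
        return None  # no hit: no visible face
    label, distance = recognizer.predict(face)
    return label if distance < threshold else None  # lower distance = better
```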
Deep Representation and Score Normalization for Face Recognition under Mismatched Conditions
Esam Ghaleb, Gokhan Ozbulak, Hua Gao, et al.
Face recognition under unconstrained conditions is a challenging computer vision task. Identification under mismatched conditions, for example, due to differences in view angle, illumination, and image quality between gallery and probe images, as in the International Challenge on Biometric Recognition-in-the-Wild (ICB-RW) 2016, poses even further challenges. In our work, to address this problem, we have employed facial image preprocessing, deep representation, and score normalization methods to develop a successful face recognition system. In the preprocessing step, we have aligned the gallery and probe face images with respect to automatically detected eye centers. We used only frontal faces in the gallery. For face representation, we have employed a state-of-the-art deep convolutional neural network model, namely the VGGFace model. For classification, we have applied a nearest neighbor classifier with correlation distance as the distance metric. As the final step, we normalized the resulting similarity score matrix, which includes the scores of all face images in the probe set against all face images in the gallery set, with z-score normalization. The proposed system achieved 69.8% Rank-1 and 85.3% Rank-5 accuracy on the test set, which were the highest accuracies obtained in the challenge.
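As a rough illustration of the matching stage described above, the sketch below computes a probe-vs-gallery similarity matrix from precomputed deep features using correlation distance, applies z-score normalization to the score matrix, and evaluates Rank-k accuracy. The feature dimensionality, the toy data, and the choice of normalizing over the whole matrix (rather than per probe) are assumptions; the VGGFace feature extraction itself is not shown.

```python
# Sketch of nearest-neighbour matching with correlation distance and
# z-score normalization of the full probe-vs-gallery score matrix.
import numpy as np
from scipy.spatial.distance import cdist

def score_matrix(probe_feats, gallery_feats):
    """Similarity scores for every probe against every gallery image."""
    dist = cdist(probe_feats, gallery_feats, metric="correlation")
    sim = 1.0 - dist                          # higher = more similar
    return (sim - sim.mean()) / sim.std()     # global z-score (assumption)

def rank_k_accuracy(scores, probe_labels, gallery_labels, k=1):
    """Fraction of probes whose true identity is among the top-k matches."""
    order = np.argsort(-scores, axis=1)[:, :k]
    hits = [probe_labels[i] in gallery_labels[order[i]]
            for i in range(len(probe_labels))]
    return float(np.mean(hits))

# Hypothetical usage with toy features and labels.
gallery = np.random.rand(90, 4096)            # one frontal image per identity
probes = np.random.rand(500, 4096)
g_labels = np.arange(90)
p_labels = np.random.randint(0, 90, size=500)
S = score_matrix(probes, gallery)
print(rank_k_accuracy(S, p_labels, g_labels, k=1))
print(rank_k_accuracy(S, p_labels, g_labels, k=5))
```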