Publications
A collection of selected research papers and conference proceedings.
2025
Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning
Alain Komaty, Hatef Otroshi-Shahreza, Anjith George, Sébastien Marcel
2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 2025
This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. Our results (https://gitlab.idiap.ch/bob/bob.paper.wacv2025.chatgpt.face.pad) show that GPT-4o demonstrates high consistency, particularly in few-shot in-context learning, where its performance improves as more examples are provided (reference data). We also observe that detailed prompts enable the model to provide scores reliably, a behavior not observed with concise prompts. Additionally, explanation-seeking prompts slightly enhance the model's performance by improving its interpretability. Remarkably, the model exhibits emergent reasoning capabilities, correctly predicting the attack type (print or replay) with high accuracy in few-shot scenarios, despite not being explicitly instructed to classify attack types. Despite these strengths, GPT-4o faces challenges in zero-shot tasks, where its performance is limited compared to specialized PAD systems. Experiments were conducted on a subset of the SOTERIA dataset, ensuring compliance with data privacy regulations by using only data from consenting individuals. These findings underscore GPT-4o's promise in PAD applications, laying the groundwork for future research to address broader data privacy concerns and improve cross-dataset generalization.
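For readers curious how such a few-shot query could be assembled, here is a minimal sketch using the OpenAI Python client. The model name comes from the paper; the prompt wording, image file names, and scoring instruction are illustrative assumptions rather than the study's exact protocol.

```python
# Hypothetical sketch of a few-shot PAD query to GPT-4o; prompt text and
# file names are illustrative only, not the paper's exact protocol.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def image_part(path: str) -> dict:
    """Encode an image file as a data-URL content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}

# Few-shot in-context examples (bona fide and attack) followed by the probe.
content = [
    {"type": "text", "text": (
        "You are a face presentation attack detector. "
        "The first image is a bona fide face, the second is a print attack. "
        "Classify the third image as 'bona fide' or 'attack' and give a "
        "score in [0, 1].")},
    image_part("bonafide_example.jpg"),
    image_part("print_attack_example.jpg"),
    image_part("probe.jpg"),
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```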
Enhancing Domain Diversity in Synthetic Data Face Recognition with Dataset Fusion
Anjith George, Sébastien Marcel
arXiv.org, 2025
While the accuracy of face recognition systems has improved significantly in recent years, the datasets used to train these models are often collected through web crawling without the explicit consent of users, raising ethical and privacy concerns. To address this, many recent approaches have explored the use of synthetic data for training face recognition models. However, these models typically underperform compared to those trained on real-world data. A common limitation is that a single generator model is often used to create the entire synthetic dataset, leading to model-specific artifacts that may cause overfitting to the generator's inherent biases and artifacts. In this work, we propose a solution by combining two state-of-the-art synthetic face datasets generated using architecturally distinct backbones. This fusion reduces model-specific artifacts, enhances diversity in pose, lighting, and demographics, and implicitly regularizes the face recognition model by emphasizing identity-relevant features. We evaluate the performance of models trained on this combined dataset using standard face recognition benchmarks and demonstrate that our approach achieves superior performance across many of these benchmarks.
ArtFace: Towards Historical Portrait Face Identification via Model Adaptation
Francois Poh, Anjith George, Sébastien Marcel
arXiv.org, 2025
Identifying sitters in historical paintings is a key task for art historians, offering insight into the sitters' lives and how they chose to be seen. However, the process is often subjective and constrained by limited data and stylistic variation. Automated facial recognition can handle challenging conditions and can assist, but while traditional facial recognition models perform well on photographs, they struggle with paintings due to domain shift and high intra-class variation. Artistic factors such as style, skill, intent, and influence from other works further complicate recognition. In this work, we investigate the potential of foundation models to improve facial recognition in artworks. By fine-tuning foundation models and integrating their embeddings with those from conventional facial recognition networks, we demonstrate notable improvements over current state-of-the-art methods. Our results show that foundation models can bridge the gap where traditional methods are ineffective. Paper page at https://www.idiap.ch/paper/artface/
xEdgeFace: Efficient Cross-Spectral Face Recognition for Edge Devices
Anjith George, Sébastien Marcel
arXiv.org, 2025
Heterogeneous Face Recognition (HFR) addresses the challenge of matching face images across different sensing modalities, such as thermal to visible or near-infrared to visible, expanding the applicability of face recognition systems in real-world, unconstrained environments. While recent HFR methods have shown promising results, many rely on computation-intensive architectures, limiting their practicality for deployment on resource-constrained edge devices. In this work, we present a lightweight yet effective HFR framework by adapting a hybrid CNN-Transformer architecture originally designed for face recognition. Our approach enables efficient end-to-end training with minimal paired heterogeneous data while preserving strong performance on standard RGB face recognition tasks. This makes it a compelling solution for both homogeneous and heterogeneous scenarios. Extensive experiments across multiple challenging HFR and face recognition benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches while maintaining a low computational overhead.
The Invisible Threat: Evaluating the Vulnerability of Cross-Spectral Face Recognition to Presentation Attacks
Anjith George, Sébastien Marcel
arXiv.org, 2025
Cross-spectral face recognition systems are designed to enhance the performance of facial recognition systems by enabling cross-modal matching under challenging operational conditions. A particularly relevant application is the matching of near-infrared (NIR) images to visible-spectrum (VIS) images, enabling the verification of individuals by comparing NIR facial captures acquired with VIS reference images. The use of NIR imaging offers several advantages, including greater robustness to illumination variations, better visibility through glasses and glare, and greater resistance to presentation attacks. Despite these claimed benefits, the robustness of NIR-based systems against presentation attacks has not been systematically studied in the literature. In this work, we conduct a comprehensive evaluation of the vulnerability of NIR-VIS cross-spectral face recognition systems to presentation attacks. Our empirical findings indicate that, although these systems exhibit a certain degree of reliability, they remain vulnerable to specific attacks, emphasizing the need for further research in this area.
EdgeDoc: Hybrid CNN-Transformer Model for Accurate Forgery Detection and Localization in ID Documents
Anjith George, Sébastien Marcel
arXiv.org, 2025
The widespread availability of tools for manipulating images and documents has made it increasingly easy to forge digital documents, posing a serious threat to Know Your Customer (KYC) processes and remote onboarding systems. Detecting such forgeries is essential to preserving the integrity and security of these services. In this work, we present EdgeDoc, a novel approach for the detection and localization of document forgeries. Our architecture combines a lightweight convolutional transformer with auxiliary noiseprint features extracted from the images, enhancing its ability to detect subtle manipulations. EdgeDoc achieved third place in the ICCV 2025 DeepID Challenge, demonstrating its competitiveness. Experimental results on the FantasyID dataset show that our method outperforms baseline approaches, highlighting its effectiveness in real-world scenarios. Project page: https://www.idiap.ch/paper/edgedoc/
2024
Knowledge Distillation for Face Recognition Using Synthetic Data With Dynamic Latent Sampling
Hatef Otroshi Shahreza, Anjith George, Sébastien Marcel
IEEE Access, 2024
State-of-the-art face recognition models are computationally expensive for mobile applications. Training lightweight face recognition models also requires large identity-labeled datasets, raising privacy and ethical concerns. Generating synthetic datasets for training is also challenging, and there is a significant gap in performance between models trained on real and synthetic face datasets. We propose a new framework (called SynthDistill) to train lightweight face recognition models by distilling the knowledge from a pretrained teacher model using synthetic data. We generate synthetic face images without identity labels, mitigating the problems in the intra-class variation generation of synthetic datasets, and dynamically sample from the intermediate latent space of a face generator network to generate new variations of the challenging images while further exploring new face images. The results on different benchmarking real face recognition datasets demonstrate the superiority of SynthDistill compared to training on previous synthetic datasets, achieving a verification accuracy of 99.52% on the LFW dataset with a lightweight network. The results also show that SynthDistill significantly narrows the gap between real and synthetic data training. The source code of our experiments is publicly available to facilitate the reproducibility of our work.
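As a rough illustration of the distillation idea described above, the sketch below matches a student's embeddings to a teacher's on freshly generated faces. `generator`, `teacher`, and `student` are placeholders for the pretrained networks, and the dynamic latent sampling is only indicated in a comment.

```python
# Minimal sketch of embedding-level distillation on generated faces;
# generator/teacher/student are placeholder networks, not the paper's code.
import torch
import torch.nn.functional as F

def distill_step(generator, teacher, student, optimizer,
                 batch=32, latent_dim=512):
    z = torch.randn(batch, latent_dim)      # sample latents, no identity labels
    with torch.no_grad():
        images = generator(z)               # synthesize a training batch
        target = teacher(images)            # teacher embeddings
    pred = student(images)                  # student embeddings
    loss = F.mse_loss(pred, target)         # match the teacher's embedding space
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Dynamic sampling, as described above, would re-draw latents near the
    # highest-loss samples to mine harder variations in the next batch.
    return loss.item()
```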
Heterogeneous Face Recognition Using Domain Invariant Units
Anjith George, Sébastien Marcel
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2024
Heterogeneous Face Recognition (HFR) aims to expand the applicability of Face Recognition (FR) systems to challenging scenarios, enabling the matching of face images across different domains, such as matching thermal images to visible spectra. However, the development of HFR systems is challenging because of the significant domain gap between modalities and the limited availability of large-scale paired multi-channel data. In this work, we leverage a pretrained face recognition model as a teacher network to learn domain-invariant network layers called Domain-Invariant Units (DIU) to reduce the domain gap. The proposed DIU can be trained effectively even with a limited amount of paired training data, in a contrastive distillation framework. This proposed approach has the potential to enhance pretrained models, making them more adaptable to a wider range of variations in data. We extensively evaluate our approach on multiple challenging benchmarks, demonstrating superior performance compared to state-of-the-art methods.
Modality Agnostic Heterogeneous Face Recognition with Switch Style Modulators
Anjith George, Sébastien Marcel
2024 IEEE International Joint Conference on Biometrics (IJCB), 2024
Heterogeneous Face Recognition (HFR) systems aim to enhance the capability of face recognition in challenging cross-modal authentication scenarios. However, the significant domain gap between the source and target modalities poses a considerable challenge for cross-domain matching. Existing literature primarily focuses on developing HFR approaches for specific pairs of face modalities, necessitating the explicit training of models for each source-target combination. In this work, we introduce a novel framework designed to train a modality-agnostic HFR method capable of handling multiple modalities during inference, all without explicit knowledge of the target modality labels. We achieve this by implementing a computationally efficient automatic routing mechanism called Switch Style Modulation Blocks (SSMB) that trains various domain-expert modulators, which transform the feature maps adaptively to reduce the domain gap. Our proposed SSMB can be trained end-to-end and seamlessly integrated into pre-trained face recognition models, transforming them into modality-agnostic HFR models. We have performed extensive evaluations on HFR benchmark datasets to demonstrate its effectiveness. The source code and protocols will be made publicly available.
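A loose sketch of the routing idea follows: a lightweight gate predicts weights over a small set of expert modulation branches and blends their outputs, with no target modality label required at inference. The layer sizes and the 1x1-convolution experts are assumptions, not the paper's architecture.

```python
# Loose sketch of input-conditioned routing over expert modulators;
# all sizes and the expert design are assumptions.
import torch
import torch.nn as nn

class SwitchStyleBlock(nn.Module):
    def __init__(self, channels: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv2d(channels, channels, 1) for _ in range(n_experts)
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, n_experts), nn.Softmax(dim=1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        w = self.gate(feat)  # (B, n_experts), inferred from the input itself
        out = torch.stack([e(feat) for e in self.experts], dim=1)  # (B, E, C, H, W)
        return (w.view(*w.shape, 1, 1, 1) * out).sum(dim=1)
```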
Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks
A. Unnervik, Hatef Otroshi-Shahreza, Anjith George, Sébastien Marcel
arXiv.org, 2024
Backdoor attacks allow an attacker to embed a specific vulnerability in a machine learning algorithm, activated when an attacker-chosen pattern is presented, causing a specific misprediction. The need to identify backdoors in biometric scenarios has led us to propose a novel technique with different trade-offs. In this paper we propose to use model pairs on open-set classification tasks for detecting backdoors. Using a simple linear operation to project embeddings from a probe model's embedding space to a reference model's embedding space, we can compare both embeddings and compute a similarity score. We show that this score can be an indicator of the presence of a backdoor, even when the models are of different architectures and have been trained independently on different datasets. This technique allows for the detection of backdoors on models designed for open-set classification tasks, which is little studied in the literature. Additionally, we show that backdoors can be detected even when both models are backdoored. The source code is made available for reproducibility purposes.
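The pairing idea can be illustrated in a few lines of NumPy: fit a least-squares linear map from the probe model's embedding space to the reference model's, then use the post-projection similarity as an indicator. This is a sketch of the general technique, not the paper's exact procedure.

```python
# Sketch of embedding translation between two models; illustrative only.
import numpy as np

def fit_translation(probe_emb: np.ndarray, ref_emb: np.ndarray) -> np.ndarray:
    """Least-squares linear map W such that probe_emb @ W ~= ref_emb."""
    W, *_ = np.linalg.lstsq(probe_emb, ref_emb, rcond=None)
    return W

def similarity_score(W: np.ndarray, probe_emb: np.ndarray,
                     ref_emb: np.ndarray) -> float:
    """Mean cosine similarity between translated probe and reference embeddings."""
    proj = probe_emb @ W
    num = (proj * ref_emb).sum(axis=1)
    den = np.linalg.norm(proj, axis=1) * np.linalg.norm(ref_emb, axis=1)
    # An unusually low score on clean data may indicate a backdoored model.
    return float((num / den).mean())
```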
Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data
Ivan Deandres-Tame, Rubén Tolosana, Pietro Melzi, R. Vera-Rodríguez, Minchul Kim, C. Rathgeb, Xiaoming Liu, A. Morales, Julian Fiérrez, J. Ortega-Garcia, Zhizhou Zhong, Y. Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, A.P. Grigorev, Denis Timoshenko, K. Asfaw, Cheng Yaw Low, Hao Liu, Chuyi Wang, Qing Zuo, Zhixiang He, Hatef Otroshi Shahreza, Anjith George, A. Unnervik, Parsa Rahimi, Sébastien Marcel, Pedro C. Neto, Marco Huber, J. Kolf, Naser Damer, Fadi Boutros, Jaime S. Cardoso, Ana F. Sequeira, A. Atzori, G. Fenu, Mirko Marras, Vitomir Štruc, Jiang Yu, Zhangjie Li, Jichun Li, Weisong Zhao, Zhen Lei, Xiangyu Zhu, Xiao-Yu Zhang, Bernardo Biesseck, Pedro Vidal, Luiz Coelho, Roger Granada, David Menotti
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated by several factors, such as the lack of real data and intra-class variability, the time and errors involved in manual labeling, and, in some cases, privacy concerns. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which synthetic data from DCFace and GANDiffFace methods was only allowed to train face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.
SDFR: Synthetic Data for Face Recognition Competition
Hatef Otroshi-Shahreza, Christophe Ecabert, Anjith George, A. Unnervik, Sébastien Marcel, Nicolò Di Domenico, Guido Borghi, Davide Maltoni, Fadi Boutros, Julia Vogel, Naser Damer, Ángela Sánchez-Pérez, Enrique Mas-Candela, Jorge Calvo-Zaragoza, Bernardo Biesseck, Pedro Vidal, Roger Granada, David Menotti, Ivan Deandres-Tame, Simone Maurizio La Cava, S. Concas, Pietro Melzi, Rubén Tolosana, R. Vera-Rodríguez, Gianpaolo Perelli, G. Orrú, G. L. Marcialis, Julian Fiérrez
IEEE International Conference on Automatic Face & Gesture Recognition, 2024
Large-scale face recognition datasets are collected by crawling the Internet without individuals' consent, raising legal, ethical, and privacy concerns. With recent advances in generative models, several works have proposed generating synthetic face recognition datasets to mitigate the concerns associated with web-crawled face recognition datasets. This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024), established to investigate the use of synthetic data for training face recognition models. The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones. In the first task, the face recognition backbone was fixed and the dataset size was limited, while the second task provided almost complete freedom on the model backbone, the dataset, and the training pipeline. The submitted models were trained on existing and also new synthetic datasets and used clever methods to improve training with synthetic data. The submissions were evaluated and ranked on a diverse set of seven benchmarking datasets. The paper gives an overview of the submitted face recognition models and reports the achieved performance compared to baseline models trained on real and synthetic datasets. Furthermore, the evaluation of submissions is extended to bias assessment across different demographic groups. Lastly, an outlook on the current state of research in training face recognition models using synthetic data is presented, and existing problems as well as potential future directions are discussed.
FRCSyn-onGoing: Benchmarking and comprehensive evaluation of real and synthetic data to improve face recognition systems
Pietro Melzi, Rubén Tolosana, R. Vera-Rodríguez, Minchul Kim, C. Rathgeb, Xiaoming Liu, Ivan Deandres-Tame, A. Morales, Julian Fiérrez, J. Ortega-Garcia, Weisong Zhao, Xiangyu Zhu, Zheyu Yan, Xiao-Yu Zhang, Jinlin Wu, Zhen Lei, Suvidha Tripathi, Mahak Kothari, Md Haider Zama, Debayan Deb, Bernardo Biesseck, Pedro Vidal, Roger Granada, Guilherme Fickel, Gustavo Führ, David Menotti, A. Unnervik, Anjith George, Christophe Ecabert, Hatef Otroshi-Shahreza, Parsa Rahimi, Sébastien Marcel, Ioannis Sarridis, C. Koutlis, Georgia Baltsou, Symeon Papadopoulos, Christos Diou, Nicolò Di Domenico, Guido Borghi, Lorenzo Pellegrini, Enrique Mas-Candela, Ángela Sánchez-Pérez, A. Atzori, Fadi Boutros, Naser Damer, G. Fenu, Mirko Marras
Information Fusion, 2024
No abstract available for this publication.
Vulnerability of Face Age Verification to Replay Attacks
Pavel Korshunov, Anjith George, Gökhan Özbulak, Sébastien Marcel
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2024
Presentation attacks on biometric systems have long created significant security risks. The increase in the adoption of age verification systems, which ensure that only age-appropriate content is consumed online, raises the question of the vulnerability of such systems to replay presentation attacks. In this paper, we analyze the vulnerability of face age verification to simple replay attacks and assess whether presentation attack detection (PAD) systems created for biometrics can be effective at detecting similar attacks on age verification. We used three types of attacks, captured with iPhone 12, Galaxy S9, and Huawei Mate 30 phones from an iPad Pro that replayed images from the commonly used UTKFace dataset of faces with true age labels. We evaluated four state-of-the-art face age verification algorithms, including simple classification, distribution-based, regression via classification, and adaptive distribution approaches. We show that these algorithms are vulnerable to the attacks, since the accuracy of age verification on replayed images differs by only a couple of percentage points from that on the original images, which means an age verification system cannot distinguish attacks from bona fide images. Using two state-of-the-art presentation attack detection systems, DeepPixBiS and CDCN, trained to detect similar attacks on biometrics, we demonstrate that they struggle to detect both the types of attacks possible in the age verification scenario and the types of bona fide images commonly used. These results highlight the need for the development of age-verification-specific attack detection systems for age verification to become practical.
Face Reconstruction from Face Embeddings Using Adapter to a Face Foundation Model
Hatef Otroshi-Shahreza, Anjith George, Sébastien Marcel
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2025
Face recognition systems extract embedding vectors from face images and use these embeddings to verify or identify individuals. Face reconstruction attack (also known as template inversion) refers to reconstructing face images from embeddings and using the reconstructed image to enter a face recognition system. In this paper, we propose to use a face foundation model to reconstruct face images from the embeddings of a black-box face recognition model. The foundation model is trained with 42M images to generate face images from the facial embeddings of a fixed face recognition model. We propose to use an adapter (called Face Adapter) to translate target embeddings into the embedding space of the foundation model. The generated images are evaluated on different face recognition models and different datasets, demonstrating the effectiveness of our method in translating embeddings of different face recognition models. We also evaluate the transferability of reconstructed face images when attacking different face recognition models. Our experimental results show that our reconstructed face images outperform previous reconstruction attacks against face recognition models.
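As a sketch of the adapter concept, the snippet below shows a small MLP that translates an embedding from the black-box model's space into the space expected by the frozen foundation generator; the dimensions, depth, and the training objective in the comment are assumptions.

```python
# Hypothetical sketch of an embedding-translation adapter; dimensions
# and architecture are placeholders, not the paper's design.
import torch
import torch.nn as nn

class FaceAdapter(nn.Module):
    def __init__(self, src_dim=512, dst_dim=512, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(src_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dst_dim),
        )

    def forward(self, target_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(target_embedding)

# Training would minimize the distance between adapter(target_embedding)
# and the foundation model's own embedding of the same identity, after
# which the frozen generator reconstructs a face from the translated vector.
```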
Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models
Anjith George, Sébastien Marcel
2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 2025
The accuracy of face recognition systems has improved significantly in the past few years, thanks to the large amount of data collected and advancements in neural network architectures. However, these large-scale datasets are often collected without explicit consent, raising ethical and privacy concerns. To address this, there have been proposals to use synthetic datasets for training face recognition models. Yet, such models still rely on real data to train the generative models and generally exhibit inferior performance compared to those trained on real datasets. One of these datasets, DigiFace, uses a graphics pipeline to generate different identities and intra-class variations without using real data in model training. However, the performance of this approach is poor on face recognition benchmarks, possibly due to the lack of realism in the images generated by the graphics pipeline. In this work, we introduce a novel framework for realism transfer aimed at enhancing the realism of synthetically generated face images. Our method leverages a large-scale face foundation model, and we adapt the pipeline for realism enhancement. By integrating the controllable aspects of the graphics pipeline with our realism enhancement technique, we generate a large amount of realistic variations, combining the advantages of both approaches. Our empirical evaluations demonstrate that models trained using our enhanced dataset significantly improve the performance of face recognition systems over the baseline. The source code and dataset will be publicly accessible at the following link: https://www.idiap.ch/paper/digi2real
Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data
Ivan Deandres-Tame, Rubén Tolosana, Pietro Melzi, R. Vera-Rodríguez, Minchul Kim, C. Rathgeb, Xiaoming Liu, Luis F. Gomez, A. Morales, Julian Fiérrez, J. Ortega-Garcia, Zhizhou Zhong, Y. Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, A. I. Grigorev, Denis Timoshenko, K. Asfaw, Cheng Yaw Low, Hao Liu, Chuyi Wang, Qing Zuo, Zhixiang He, Hatef Otroshi Shahreza, Anjith George, A. Unnervik, Parsa Rahimi, Sébastien Marcel, Pedro C. Neto, Marco Huber, J. Kolf, Naser Damer, Fadi Boutros, Jaime S. Cardoso, Ana F. Sequeira, A. Atzori, G. Fenu, Mirko Marras, Vitomir Štruc, Jiang Yu, Zhangjie Li, Jichun Li, Weisong Zhao, Zhen Lei, Xiangyu Zhu, Xiao-Yu Zhang, Bernardo Biesseck, Pedro Vidal, Luiz Coelho, Roger Granada, David Menotti
Information Fusion, 2024
Synthetic data is gaining increasing popularity for face recognition technologies, mainly due to the privacy concerns and challenges associated with obtaining real data, including diverse scenarios, quality, and demographic groups, among others. It also offers some advantages over real data, such as the large amount of data that can be generated or the ability to customize it to adapt to specific problem-solving needs. To effectively use such data, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. In order to promote the proposal of novel Generative AI methods and synthetic data, and investigate the application of synthetic data to better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This is an ongoing challenge that provides researchers with an accessible platform to benchmark i) the proposal of novel Generative AI methods and synthetic data, and ii) novel face recognition systems that are specifically proposed to take advantage of synthetic data. We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, such as age disparities between training and testing, changes in the pose, or occlusions. Very interesting findings are obtained in this second edition, including a direct comparison with the first one, in which synthetic databases were restricted to DCFace and GANDiffFace.
From Modalities to Styles: Rethinking the Domain Gap in Heterogeneous Face Recognition
Anjith George, Sébastien Marcel
IEEE Transactions on Biometrics Behavior and Identity Science, 2024
Heterogeneous Face Recognition (HFR) focuses on matching faces from different domains, for instance, thermal to visible images, making Face Recognition (FR) systems more versatile for challenging scenarios. However, the domain gap between these domains and the limited large-scale datasets in the target HFR modalities make it challenging to develop robust HFR models from scratch. In our work, we view different modalities as distinct styles and propose a method to modulate feature maps of the target modality to address the domain gap. We present a new Conditional Adaptive Instance Modulation (CAIM) module that seamlessly fits into existing FR networks, turning them into HFR-ready systems. The CAIM block modulates intermediate feature maps, efficiently adapting to the style of the source modality and bridging the domain gap. Our method enables end-to-end training using a small set of paired samples. We extensively evaluate the proposed approach on various challenging HFR benchmarks, showing that it outperforms state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
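The modulation idea can be sketched as an AdaIN-style block: normalize an intermediate feature map, then scale and shift it with parameters predicted from a condition vector. The layer choices below are assumptions in the spirit of CAIM, not the published architecture.

```python
# Rough sketch of conditional instance modulation; sizes are assumptions.
import torch
import torch.nn as nn

class ConditionalInstanceModulation(nn.Module):
    def __init__(self, channels: int, cond_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_scale = nn.Linear(cond_dim, channels)
        self.to_shift = nn.Linear(cond_dim, channels)

    def forward(self, feat: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) intermediate feature map; cond: (B, cond_dim)
        gamma = self.to_scale(cond).unsqueeze(-1).unsqueeze(-1)
        beta = self.to_shift(cond).unsqueeze(-1).unsqueeze(-1)
        # Normalize away the source style, then re-style toward the target.
        return (1 + gamma) * self.norm(feat) + beta
```

Because only these small blocks are trained while the surrounding FR network stays frozen, a handful of paired samples can suffice, which matches the end-to-end, low-data training the abstract describes.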
2023
FRCSyn Challenge at WACV 2024: Face Recognition Challenge in the Era of Synthetic Data
Pietro Melzi, Rubén Tolosana, R. Vera-Rodríguez, Minchul Kim, C. Rathgeb, Xiaoming Liu, Ivan Deandres-Tame, A. Morales, Julian Fiérrez, J. Ortega-Garcia, Weisong Zhao, Xiangyu Zhu, Zheyu Yan, Xiao-Yu Zhang, Jinlin Wu, Zhen Lei, Suvidha Tripathi, Mahak Kothari, Md Haider Zama, Debayan Deb, Bernardo Biesseck, Pedro Vidal, R. Granada, Guilherme P. Fickel, Gustavo Führ, D. Menotti, A. Unnervik, Anjith George, Christophe Ecabert, Hatef Otroshi Shahreza, Parsa Rahimi, Sébastien Marcel, Ioannis Sarridis, C. Koutlis, Georgia Baltsou, Symeon Papadopoulos, Christos Diou, Nicolò Di Domenico, Guido Borghi, Lorenzo Pellegrini, Enrique Mas-Candela, Ángela Sánchez-Pérez, A. Atzori, Fadi Boutros, Naser Damer, G. Fenu, Mirko Marras
2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 2023
Despite the widespread adoption of face recognition technology around the world, and its remarkable performance on current benchmarks, there are still several challenges that require further attention. This paper offers an overview of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at WACV 2024. This is the first international challenge aiming to explore the use of synthetic data in face recognition to address existing limitations in the technology. Specifically, the FRCSyn Challenge targets concerns related to data privacy issues, demographic biases, generalization to unseen scenarios, and performance limitations in challenging scenarios, including significant age disparities between enrollment and testing, pose variations, and occlusions. The results achieved in the FRCSyn Challenge, together with the proposed benchmark, contribute significantly to the application of synthetic data to improve face recognition technology.
EdgeFace: Efficient Face Recognition Model for Edge Devices
Anjith George, Christophe Ecabert, Hatef Otroshi Shahreza, Ketan Kotwal, S. Marcel
IEEE Transactions on Biometrics Behavior and Identity Science, 2023
In this paper, we present EdgeFace, a lightweight and efficient face recognition network inspired by the hybrid architecture of EdgeNeXt. By effectively combining the strengths of CNN and Transformer models with a low-rank linear layer, EdgeFace achieves excellent face recognition performance optimized for edge devices. The proposed EdgeFace network not only maintains low computational costs and compact storage, but also achieves high face recognition accuracy, making it suitable for deployment on edge devices. The proposed EdgeFace model achieved the top ranking among models with fewer than 2M parameters in the IJCB 2023 Efficient Face Recognition Competition. Extensive experiments on challenging benchmark face datasets demonstrate the effectiveness and efficiency of EdgeFace in comparison to state-of-the-art lightweight models and deep face recognition models. Our EdgeFace model with 1.77M parameters achieves state-of-the-art results on LFW (99.73%), IJB-B (92.67%), and IJB-C (94.85%), outperforming other efficient models with larger computational complexities. The code to replicate the experiments will be made available publicly.
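To illustrate the low-rank linear layer the abstract mentions, the sketch below factorizes a d_in x d_out weight matrix through a rank-r bottleneck; the dimensions are hypothetical.

```python
# Sketch of a low-rank (factorized) linear layer: parameters drop from
# d_in*d_out to roughly r*(d_in + d_out). Dimensions are illustrative.
import torch.nn as nn

class LowRankLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.down = nn.Linear(d_in, rank, bias=False)  # d_in -> r
        self.up = nn.Linear(rank, d_out)               # r -> d_out

    def forward(self, x):
        return self.up(self.down(x))
```

For example, with d_in = d_out = 512 and rank 64, the weight count falls from about 262k to roughly 66k, which is the kind of saving that matters at the parameter budgets quoted above.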
The Unconstrained Ear Recognition Challenge 2023: Maximizing Performance and Minimizing Bias
Ž. Emeršič, T. Ohki, M. Akasaka, T. Arakawa, S. Maeda, M. Okano, Y. Sato, Anjith George, S. Marcel, I. Ganapathi, S. S. Ali, S. Javed, N. Werghi, S. G. Işık, E. Sarıtaş, H. K. Ekenel, V. Hudovernik, J. Kolf, F. Boutros, N. Damer, Grishma Sharma, A. Kamboj, A. Nigam, D. Jain, G. Cámara-Chávez, P. Peer, V. Štruc
2023 IEEE International Joint Conference on Biometrics (IJCB), 2023
The paper provides a summary of the 2023 Unconstrained Ear Recognition Challenge (UERC), a benchmarking effort focused on ear recognition from images acquired in uncontrolled environments. The objective of the challenge was to evaluate the effectiveness of current ear recognition techniques on a challenging ear dataset while analyzing the techniques from two distinct aspects, i.e., verification performance and bias with respect to specific demographic factors, namely gender and ethnicity. Seven research groups participated in the challenge and submitted seven distinct recognition approaches that ranged from descriptor-based methods and deep-learning models to ensemble techniques that relied on multiple data representations to maximize performance and minimize bias. A comprehensive investigation into the performance of the submitted models is presented, as well as an in-depth analysis of bias and associated performance differentials due to differences in gender and ethnicity. The results of the challenge suggest that a wide variety of models (e.g., transformers, convolutional neural networks, ensemble models) is capable of achieving competitive recognition results, but also that all of the models still exhibit considerable performance differentials with respect to both gender and ethnicity. To promote further development of unbiased and effective ear recognition models, the starter kit of UERC 2023, together with the baseline model and training and test data, is made available from: http://ears.fri.uni-lj.si/
EFaR 2023: Efficient Face Recognition Competition
J. Kolf, Fadi Boutros, Jurek Elliesen, Markus Theuerkauf, Naser Damer, Mohamad Alansari, Oussama Abdul Hay, Sara Alansari, S. Javed, N. Werghi, Klemen Grm, Vitomir Štruc, F. Alonso-Fernandez, Kevin Hernandez Diaz, J. Bigun, Anjith George, Christophe Ecabert, Hatef Otroshi Shahreza, Ketan Kotwal, S. Marcel, Iurii Medvedev, Bo-Hao Jin, Diogo Nunes, Ahmad Hassanpour, Pankaj Khatiwada, A. Toor, Bian Yang
2023 IEEE International Joint Conference on Biometrics (IJCB), 2023
This paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition received 17 submissions from 6 different teams. To drive further development of efficient face recognition models, the submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size. The evaluation of submissions is extended to bias, cross-quality, and large-scale recognition benchmarks. Overall, the paper gives an overview of the achieved performance values of the submitted solutions as well as a diverse set of baselines. The submitted solutions use small, efficient network architectures to reduce the computational cost, and some solutions apply model quantization. An outlook on possible techniques that are underrepresented in current solutions is given as well.
Bridging the Gap: Heterogeneous Face Recognition with Conditional Adaptive Instance Modulation
Anjith George, S. Marcel
arXiv.org, 2023
Heterogeneous Face Recognition (HFR) aims to match face images across different domains, such as thermal and visible spectra, expanding the applicability of Face Recognition (FR) systems to challenging scenarios. However, the domain gap and limited availability of large-scale datasets in the target domain make training robust and invariant HFR models from scratch difficult. In this work, we treat different modalities as distinct styles and propose a framework to adapt feature maps, bridging the domain gap. We introduce a novel Conditional Adaptive Instance Modulation (CAIM) module that can be integrated into pre-trained FR networks, transforming them into HFR networks. The CAIM block modulates intermediate feature maps to adapt to the style of the target modality, effectively bridging the domain gap. Our proposed method allows for end-to-end training with a minimal number of paired samples. We extensively evaluate our approach on multiple challenging benchmarks, demonstrating superior performance compared to state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
SynthDistill: Face Recognition with Knowledge Distillation from Synthetic Data
Hatef Otroshi Shahreza, Anjith George, S. Marcel
2023 IEEE International Joint Conference on Biometrics (IJCB), 2023
State-of-the-art face recognition networks are often computationally expensive and cannot be used for mobile applications. Training lightweight face recognition models also requires large identity-labeled datasets. Meanwhile, there are privacy and ethical concerns with collecting and using large face recognition datasets. While generating synthetic datasets for training face recognition models is an alternative option, it is challenging to generate synthetic data with sufficient intra-class variations. In addition, there is still a considerable gap between the performance of models trained on real and synthetic data. In this paper, we propose a new framework (named SynthDistill) to train lightweight face recognition models by distilling the knowledge of a pretrained teacher face recognition model using synthetic data. We use a pretrained face generator network to generate synthetic face images and use the synthesized images to learn a lightweight student network. We use synthetic face images without identity labels, mitigating the problems in the intra-class variation generation of synthetic datasets. Instead, we propose a novel dynamic sampling strategy from the intermediate latent space of the face generator network to include new variations of the challenging images while further exploring new face images in the training batch. The results on five different face recognition datasets demonstrate the superiority of our lightweight model compared to models trained on previous synthetic datasets, achieving a verification accuracy of 99.52% on the LFW dataset with a lightweight network. The results also show that our proposed framework significantly reduces the gap between training with real and synthetic data. The source code for replicating the experiments is publicly released.
2022
Attacking Face Recognition With T-Shirts: Database, Vulnerability Assessment, and Detection
M. Ibsen, C. Rathgeb, Fabian Brechtel, Ruben Klepp, K. Pöppelmann, Anjith George, S. Marcel, C. Busch
IEEE Access, 2022
Face recognition systems are widely deployed for biometric authentication. Despite this, it is well-known that, without any safeguards, face recognition systems are highly vulnerable to presentation attacks. In response to this security issue, several promising methods for detecting presentation attacks have been proposed which show high performance on existing benchmarks. However, an ongoing challenge is the generalization of presentation attack detection methods to unseen and new attack types. To this end, we propose a new T-shirt Face Presentation Attack (TFPA) database of 1,608 T-shirt attacks using 100 unique presentation attack instruments. In an extensive evaluation, we show that this type of attack can compromise the security of face recognition systems and that some state-of-the-art attack detection mechanisms trained on popular benchmarks fail to robustly generalize to the new attacks. Further, we propose three new methods for detecting T-shirt attack images, one which relies on the statistical differences between depth maps of bona fide images and T-shirt attacks, an anomaly detection approach trained on features only extracted from bona fide RGB images, and a fusion approach which achieves competitive detection performance.
Robust Face Presentation Attack Detection with Multi-Channel Neural Networks
Anjith George, S. Marcel
Unknown Venue, 2022
No abstract available for this publication.
Prepended Domain Transformer: Heterogeneous Face Recognition Without Bells and Whistles
Anjith George, Amir Mohammadi, S. Marcel
IEEE Transactions on Information Forensics and Security, 2022
Heterogeneous Face Recognition (HFR) refers to matching face images captured in different domains, such as thermal to visible (VIS) images, sketches to visible images, near-infrared to visible, and so on. This is particularly useful in matching visible spectrum images to images captured from other modalities. Though highly useful, HFR is challenging because of the domain gap between the source and target domain. Often, large-scale paired heterogeneous face image datasets are absent, preventing training models specifically for the heterogeneous task. In this work, we propose a surprisingly simple, yet very effective method for matching face images across different sensing modalities. The core idea of the proposed approach is to add a novel neural network block called Prepended Domain Transformer (PDT) in front of a pre-trained face recognition (FR) model to address the domain gap. Retraining this new block with a few paired samples in a contrastive learning setup was enough to achieve state-of-the-art performance in many HFR benchmarks. The PDT blocks can be retrained for several source-target combinations using the proposed general framework. The proposed approach is architecture-agnostic, meaning the blocks can be added to any pre-trained FR model. Further, the approach is modular and the new block can be trained with a minimal set of paired samples, making it much easier for practical deployment. The source code and protocols will be made available publicly.
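A minimal sketch of the setup, under assumed architecture details: a small convolutional block is prepended to a frozen FR model and trained with a contrastive loss on a few paired heterogeneous samples.

```python
# Illustrative sketch of a prepended block in front of a frozen FR model;
# the block architecture and margin are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrependedBlock(nn.Module):
    """Maps target-modality images toward the source domain of the FR model."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def contrastive_loss(emb_a, emb_b, same_identity, margin=0.5):
    """same_identity: bool tensor; pull genuine pairs together, push impostors apart."""
    d = F.pairwise_distance(emb_a, emb_b)
    return torch.where(same_identity, d.pow(2),
                       F.relu(margin - d).pow(2)).mean()

# During training only the prepended block is updated, e.g.:
#   emb_vis = frozen_fr(vis_image)
#   emb_nir = frozen_fr(prepended_block(nir_image))
```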
A Comprehensive Evaluation on Multi-channel Biometric Face Presentation Attack Detection
Anjith George, David Geissbuhler, S. Marcel
arXiv.org, 2022
The vulnerability to presentation attacks is a crucial problem undermining the wide deployment of face recognition systems. Though presentation attack detection (PAD) systems try to address this problem, the lack of generalization and robustness continues to be a major concern. Several works have shown that using multi-channel PAD systems could alleviate this vulnerability and result in more robust systems. However, there is a wide selection of channels available for a PAD system, such as RGB, Near Infrared, Shortwave Infrared, Depth, and Thermal sensors. Using many sensors increases the cost of the system, so an understanding of the performance of different sensors against a wide variety of attacks is necessary when selecting the modalities. In this work, we perform a comprehensive study to understand the effectiveness of various imaging modalities for PAD. The studies are performed on a multi-channel PAD dataset collected with 14 different sensing modalities, considering a wide range of 2D, 3D, and partial attacks. We used a multi-channel convolutional network-based architecture, which uses pixel-wise binary supervision. The model has been evaluated with different combinations of channels and different image qualities on a variety of challenging known and unknown attack protocols. The results reveal interesting trends and can act as pointers for sensor selection for safety-critical presentation attack detection systems. The source codes and protocols to reproduce the results are made available publicly, making it possible to extend this work to other architectures.
2021
Multi-channel Face Presentation Attack Detection Using Deep Learning
Anjith George, S. Marcel
Advances in Computer Vision and Pattern Recognition, 2021
No abstract available for this publication.
Cross Modal Focal Loss for RGBD Face Anti-Spoofing
Anjith George, S. Marcel
Computer Vision and Pattern Recognition, 2021
Automatic methods for detecting presentation attacks are essential to ensure the reliable use of facial recognition technology. Most of the methods available in the literature for presentation attack detection (PAD) fail to generalize to unseen attacks. In recent years, multi-channel methods have been proposed to improve the robustness of PAD systems. Often, only a limited amount of data is available for additional channels, which limits the effectiveness of these methods. In this work, we present a new framework for PAD that uses RGB and depth channels together with a novel loss function. The new architecture uses complementary information from the two modalities while reducing the impact of overfitting. Essentially, a cross-modal focal loss function is proposed to modulate the loss contribution of each channel as a function of the confidence of individual channels. Extensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed approach.
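The loss idea can be paraphrased in a few lines (this is an illustration of the mechanism, not necessarily the paper's exact formula): each branch's binary cross-entropy is down-weighted when the other branch is already confident about the sample.

```python
# Paraphrased sketch of a cross-modal focal weighting; gamma and the
# exact modulation form are assumptions.
import torch.nn.functional as F

def cross_modal_focal(p_rgb, p_depth, target, gamma=2.0):
    """p_rgb, p_depth: post-sigmoid attack probabilities from the RGB and
    depth branches; target: float labels in {0., 1.}."""
    # Probability each branch assigns to the *true* class.
    q_rgb = target * p_rgb + (1 - target) * (1 - p_rgb)
    q_d = target * p_depth + (1 - target) * (1 - p_depth)
    bce_rgb = F.binary_cross_entropy(p_rgb, target, reduction="none")
    bce_d = F.binary_cross_entropy(p_depth, target, reduction="none")
    # Down-weight a branch when the *other* branch is already confident.
    loss_rgb = (1 - q_d).detach().pow(gamma) * bce_rgb
    loss_d = (1 - q_rgb).detach().pow(gamma) * bce_d
    return (loss_rgb + loss_d).mean()
```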
Face Liveness Detection Competition (LivDet-Face) - 2021
Sandip Purnapatra, Nic Smalt, Keivan Bahmani, Priyanka Das, David Yambay, Amir Mohammadi, Anjith George, T. Bourlai, S. Marcel, S. Schuckers, Meiling Fang, Naser Damer, Fadi Boutros, Arjan Kuijper, Alperen Kantarci, Basar Demir, Zafer Yildiz, Zabi Ghafoory, Hasan Dertli, H. K. Ekenel, Son Vu, V. Christophides, Liang Dashuang, Zhang Guanghao, Hao Zhanlong, Liu Junfu, Jin Yufeng, Samo Liu, Samuel Huang, Salieri Kuei, Jag Mohan Singh, Raghavendra Ramachandra
2021 IEEE International Joint Conference on Biometrics (IJCB), 2021
Liveness Detection (LivDet)-Face is an international competition series open to academia and industry. The competition’s objective is to assess and report the state of the art in liveness / Presentation Attack Detection (PAD) for face recognition. Impersonation and presentation of false samples to the sensors can be classified as presentation attacks, and the ability of the sensors to detect such attempts is known as PAD. LivDet-Face 2021 is the first edition of the face liveness competition. This competition serves as an important benchmark in face presentation attack detection, offering (a) an independent assessment of the current state of the art in face PAD, and (b) a common evaluation protocol, availability of Presentation Attack Instruments (PAI), and a live face image dataset through the Biometric Evaluation and Testing (BEAT) platform. After the competition closes, researchers can continue to follow it on a platform in which participants can compare their solutions against the LivDet-Face winners.
2020
Can Your Face Detector Do Anti-spoofing? Face Presentation Attack Detection with a Multi-Channel Face Detector
Anjith George, S. Marcel
arXiv.org, 2020
In a typical face recognition pipeline, the task of the face detector is to localize the face region. However, the face detector localizes regions that look like a face, irrespective of the liveness of the face, which makes the entire system susceptible to presentation attacks. In this work, we try to reformulate the task of the face detector to detect real faces, thus eliminating the threat of presentation attacks. While this task could be challenging with visible spectrum images alone, we leverage the multi-channel information available from off-the-shelf devices (such as color, depth, and infrared channels) to design a multi-channel face detector. The proposed system can be used as a live-face detector, obviating the need for a separate presentation attack detection module and making the system reliable in practice without any additional computational overhead. The main idea is to leverage a single-stage object detection framework, with a joint representation obtained from different channels for the PAD task. We have evaluated our approach on the multi-channel WMCA dataset containing a wide variety of attacks to show the effectiveness of the proposed framework.
On the Effectiveness of Vision Transformers for Zero-shot Face Anti-Spoofing
Anjith George, S. Marcel
2021 IEEE International Joint Conference on Biometrics (IJCB), 2020
The vulnerability of face recognition systems to presentation attacks has limited their application in security-critical scenarios. Automatic methods of detecting such malicious attempts are essential for the safe use of facial recognition technology. Although various methods have been suggested for detecting such attacks, most of them over-fit the training set and fail to generalize to unseen attacks and environments. In this work, we use transfer learning from a vision transformer model for the zero-shot anti-spoofing task. The effectiveness of the proposed approach is demonstrated through experiments on publicly available datasets. The proposed approach outperforms the state-of-the-art methods in the zero-shot protocols of the HQ-WMCA and SiW-M datasets by a large margin. Moreover, the model achieves a significant boost in cross-database performance as well.
Learning One Class Representations for Face Presentation Attack Detection Using Multi-Channel Convolutional Neural Networks
Anjith George, S. Marcel
IEEE Transactions on Information Forensics and Security, 2020
Face recognition has evolved as a widely used biometric modality. However, its vulnerability against presentation attacks poses a significant security threat. Though presentation attack detection (PAD) methods try to address this issue, they often fail in generalizing to unseen attacks. In this work, we propose a new framework for PAD using a one-class classifier, where the representation used is learned with a Multi-Channel Convolutional Neural Network (MCCNN). A novel loss function is introduced, which forces the network to learn a compact embedding for bonafide class while being far from the representation of attacks. A one-class Gaussian Mixture Model is used on top of these embeddings for the PAD task. The proposed framework introduces a novel approach to learn a robust PAD system from bonafide and available (known) attack classes. This is particularly important as collecting bonafide data and simpler attacks are much easier than collecting a wide variety of expensive attacks. The proposed system is evaluated on the publicly available WMCA multi-channel face PAD database, which contains a wide variety of 2D and 3D attacks. Further, we have performed experiments with MLFP and SiW-M datasets using RGB channels only. Superior performance in unseen attack protocols shows the effectiveness of the proposed approach. Software, data, and protocols to reproduce the results are made available publicly.
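The one-class stage described above can be sketched as follows: fit a Gaussian Mixture Model on bona fide embeddings only, then threshold its log-likelihood at test time. Embedding extraction, the file name, and the number of components are placeholders.

```python
# Sketch of a one-class GMM over bona fide embeddings; the embedding
# file and component count are hypothetical.
import numpy as np
from sklearn.mixture import GaussianMixture

bonafide_embeddings = np.load("bonafide_embeddings.npy")  # hypothetical file

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(bonafide_embeddings)  # trained on the bona fide class only

def pad_score(embedding: np.ndarray) -> float:
    """Higher log-likelihood = more bona fide-like; low values flag attacks."""
    return float(gmm.score_samples(embedding.reshape(1, -1))[0])
```

The appeal of this design, as the abstract notes, is that only bona fide data (plus whatever known attacks shape the embedding) is needed, so the detector is not tied to any specific attack type.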
Deep Models and Shortwave Infrared Information to Detect Face Presentation Attacks
G. Heusch, Anjith George, David Geissbuhler, Z. Mostaani, S. Marcel
IEEE Transactions on Biometrics Behavior and Identity Science, 2020
This paper addresses the problem of face presentation attack detection using different image modalities. In particular, the usage of short wave infrared (SWIR) imaging is considered. Face presentation attack detection is performed using recent models based on Convolutional Neural Networks, using only carefully selected SWIR image differences as input. Conducted experiments show superior performance over similar models acting on either color images or on a combination of different modalities (visible, NIR, thermal and depth), as well as over an SVM-based classifier acting on SWIR image differences. Experiments have been carried out on a new public and freely available database, containing a wide variety of attacks. Video sequences have been recorded with several sensors, resulting in 14 different streams in the visible, NIR, SWIR and thermal spectra, as well as depth data. The best proposed approach is able to almost perfectly detect all impersonation attacks while ensuring low bonafide classification errors. On the other hand, obtained results show that obfuscation attacks are more difficult to detect. We hope that the proposed database will foster research on this challenging problem. Finally, all the code and instructions to reproduce the presented experiments are made available to the research community.
High-Quality Wide Multi-Channel Attack (HQ-WMCA)
G. Heusch, Anjith George, David Geissbühler, Z. Mostaani, S. Marcel
Unknown Venue, 2020
No abstract available for this publication.
The High-Quality Wide Multi-Channel Attack (HQ-WMCA) database
Z. Mostaani, Anjith George, G. Heusch, David Geissbuhler, S. Marcel
arXiv.org, 2020
The High-Quality Wide Multi-Channel Attack (HQ-WMCA) database extends the previous Wide Multi-Channel Attack database (WMCA) with more channels, including color, depth, thermal, infrared (spectra), and short-wave infrared (spectra), and a wider variety of attacks.
2019
Deep Pixel-wise Binary Supervision for Face Presentation Attack Detection
Anjith George, S. Marcel
International Conference on Biometrics, 2019
Face recognition has evolved as a prominent biometric authentication modality. However, vulnerability to presentation attacks curtails its reliable deployment. Automatic detection of presentation attacks is essential for secure use of face recognition technology in unattended scenarios. In this work, we introduce a Convolutional Neural Network (CNN) based framework for presentation attack detection, with deep pixel-wise supervision. The framework uses only frame-level information, making it suitable for deployment in smart devices with minimal computational and time overhead. We demonstrate the effectiveness of the proposed approach on public datasets for both intra- as well as cross-dataset experiments. The proposed approach achieves an HTER of 0% on the Replay-Mobile dataset and an ACER of 0.42% on Protocol-1 of the OULU-NPU dataset, outperforming state-of-the-art methods.
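A compact sketch of pixel-wise binary supervision, with the map size and loss weighting as assumptions: the network emits a low-resolution score map plus a scalar score, binary cross-entropy is applied to both, and every map cell shares the frame label.

```python
# Sketch of a pixel-wise + global BCE loss; beta and map size are assumptions.
import torch.nn.functional as F

def pixelwise_bce_loss(score_map, score, label, beta=0.5):
    """score_map: (B, 1, H, W) post-sigmoid per-patch predictions;
    score: (B,) post-sigmoid global prediction;
    label: (B,) float, 1. for bona fide and 0. for attack."""
    b, _, h, w = score_map.shape
    # Every cell of the map inherits the frame-level label.
    label_map = label.view(b, 1, 1, 1).expand(b, 1, h, w)
    loss_map = F.binary_cross_entropy(score_map, label_map)
    loss_global = F.binary_cross_entropy(score, label)
    return beta * loss_map + (1 - beta) * loss_global
```

The per-patch term forces each local region to carry evidence of attack or bona fide, which is what makes a single frame sufficient at test time.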
Biometric Face Presentation Attack Detection With Multi-Channel Convolutional Neural Network
Anjith George, Z. Mostaani, David Geissbuhler, Olegs Nikisins, André Anjos, S. Marcel
IEEE Transactions on Information Forensics and Security, 2019
Face recognition is a mainstream biometric authentication method. However, the vulnerability to presentation attacks (a.k.a. spoofing) limits its usability in unsupervised applications. Even though there are many methods available for tackling presentation attacks (PA), most of them fail to detect sophisticated attacks such as silicone masks. As the quality of presentation attack instruments improves over time, achieving reliable PA detection with visual spectra alone remains very challenging. We argue that analysis in multiple channels might help to address this issue. In this context, we propose a multi-channel Convolutional Neural Network-based approach for presentation attack detection (PAD). We also introduce the new Wide Multi-Channel presentation Attack (WMCA) database for face PAD which contains a wide variety of 2D and 3D presentation attacks for both impersonation and obfuscation attacks. Data from different channels such as color, depth, near-infrared, and thermal are available to advance the research in face PAD. The proposed method was compared with feature-based approaches and found to outperform the baselines achieving an ACER of 0.3% on the introduced dataset. The database and the software to reproduce the results are made available publicly.
Image based Eye Gaze Tracking and its Applications
Anjith George
arXiv.org, 2019
Eye movements play a vital role in perceiving the world. Eye gaze can give a direct indication of the user's point of attention, which can be useful in improving human-computer interaction. Gaze estimation in a non-intrusive manner can make human-computer interaction more natural. Eye tracking can be used for several applications such as fatigue detection, biometric authentication, disease diagnosis, activity recognition, alertness level estimation, gaze-contingent display, human-computer interaction, etc. Even though eye-tracking technology has been around for many decades, it has not found much use in consumer applications. The main reasons are the high cost of eye-tracking hardware and the lack of consumer-level applications. In this work, we attempt to address these two issues. In the first part of this work, image-based algorithms are developed for gaze tracking, including a new two-stage iris center localization algorithm. We have developed a new algorithm which works in challenging conditions such as motion blur, glint, and varying illumination levels. A person-independent gaze direction classification framework using a convolutional neural network is also developed, which eliminates the requirement of user-specific calibration.
In the second part of this work, we have developed two applications which can benefit from eye tracking data. A new framework for biometric identification based on eye movement parameters is developed. A framework for activity recognition, using gaze data from a head-mounted eye tracker, is also developed. The information from gaze data, ego-motion, and visual features is integrated to classify the activities.
Domain Adaptation in Multi-Channel Autoencoder based Features for Robust Face Anti-Spoofing
Olegs Nikisins, Anjith George, S. Marcel
International Conference on Biometrics, 2019
While the performance of face recognition systems has improved significantly in the last decade, they have been shown to be highly vulnerable to presentation attacks (spoofing). Most of the research in the field of face presentation attack detection (PAD) has focused on boosting the performance of the systems within a single database. Face PAD datasets are usually captured with RGB cameras and have a very limited number of both bona-fide samples and presentation attack instruments. Training face PAD systems on such data leads to poor performance, even in the closed-set scenario, especially when sophisticated attacks are involved. We explore two paths to boost the performance of the face PAD system against challenging attacks. First, we use multi-channel (RGB, depth, and NIR) data, which is readily accessible in a number of mass-production devices. Second, we develop a novel Autoencoders + MLP based face PAD algorithm. Moreover, instead of collecting more data for training the proposed deep architecture, a domain adaptation technique is proposed, transferring the knowledge of facial appearance from the RGB to the multi-channel domain. We also demonstrate that features learned from individual facial regions are more discriminative than features learned from the entire face. The proposed system is tested on a very recent publicly available multi-channel PAD database with a wide variety of presentation attacks.
2018
Recognition of Activities from Eye Gaze and Egocentric Video
Anjith George, A. Routray
arXiv.org, 2018
This paper presents a framework for the recognition of human activity from egocentric video and eye tracking data obtained from a head-mounted eye tracker. Three channels of information, namely eye movement, ego-motion, and visual features, are combined for the classification of activities. Image features are extracted using a pre-trained convolutional neural network. Eye movement and ego-motion are quantized, and windowed histograms are used as the features. The combination of features achieves better accuracy for activity classification than any individual feature.
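The histogram features can be sketched as follows: a motion signal is quantized into discrete levels, and a normalized histogram of levels is computed over each sliding window. The bin count, window length, and step below are illustrative assumptions.

```python
import numpy as np

def windowed_histograms(signal, n_bins=8, window=30, step=15):
    """Quantize a 1-D motion signal into n_bins levels, then build a
    normalized histogram of the levels for each sliding window."""
    edges = np.linspace(signal.min(), signal.max(), n_bins + 1)
    levels = np.clip(np.digitize(signal, edges) - 1, 0, n_bins - 1)
    feats = []
    for start in range(0, len(levels) - window + 1, step):
        hist = np.bincount(levels[start:start + window], minlength=n_bins)
        feats.append(hist / hist.sum())
    return np.array(feats)

# Concatenate per-channel histograms (eye motion, ego-motion) into one
# feature vector per window, ready for a standard classifier.
eye = windowed_histograms(np.random.randn(300))
ego = windowed_histograms(np.random.randn(300))
features = np.hstack([eye, ego])
```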
ESCaF: Pupil Centre Localization Algorithm with Candidate Filtering
Anjith George, A. Routray
arXiv.org, 2018
Accurate localization of the pupil centre is essential for gaze tracking in real-world conditions. Most algorithms fail under real-world conditions such as illumination variations, contact lenses, glasses, eye makeup, motion blur, and noise. We propose a new algorithm which improves the detection rate under such conditions. The proposed algorithm uses both edge and intensity information, along with a candidate filtering approach, to identify the best pupil candidate. A simple tracking scheme has also been added, which improves the processing speed. The algorithm has been evaluated on the Labelled Pupils in the Wild (LPW) dataset, the largest in its class, which contains real-world conditions. The proposed algorithm outperforms state-of-the-art algorithms while achieving real-time performance.
2017
A Multimodal System for Assessing Alertness Levels Due to Cognitive Loading
Anwesha Sengupta, A. Dasgupta, Aritra Chaudhuri, Anjith George, A. Routray, Rajlakshmi Guha
IEEE transactions on neural systems and rehabilitation engineering, 2017
No abstract available for this publication.
The unconstrained ear recognition challenge
Žiga Emeršič, Dejan Štepec, V. Štruc, P. Peer, Anjith George, Adil Ahmad, E. Omar, T. Boult, Reza Safdari, Yuxiang Zhou, S. Zafeiriou, Dogucan Yaman, Fevziye Irem Eyiokur, H. K. Ekenel
2017 IEEE International Joint Conference on Biometrics (IJCB), 2017
In this paper, we present the results of the Unconstrained Ear Recognition Challenge (UERC), a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions. The goal of the challenge was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and to identify open problems that need to be addressed in the future. Five groups from three continents participated in the challenge and contributed six ear recognition techniques for the evaluation, while multiple baselines were made available by the UERC organizers. A comprehensive analysis was conducted with all participating approaches, addressing essential research questions pertaining to the sensitivity of the technology to head rotation, flipping, gallery size, large-scale recognition, and others. The top performer of the UERC was found to be robust on a smaller part of the dataset (180 subjects) regardless of image characteristics, but still exhibited a significant performance drop when the entire dataset of 3,704 subjects was used for testing.
2016
Real-time eye gaze direction classification using convolutional neural network
Anjith George, A. Routray
International Conference on Signal Processing and Communications, 2016
Estimating eye gaze direction is useful in various human-computer interaction tasks. Knowledge of gaze direction gives valuable information regarding the user's point of attention, and certain patterns of eye movements, known as eye accessing cues, are reported to be related to cognitive processes in the human brain. We propose a real-time framework for the classification of eye gaze direction and the estimation of eye accessing cues. In the first stage, the algorithm detects faces using a modified version of the Viola-Jones algorithm. A rough eye region is obtained using geometric relations and facial landmarks, and this region is used in the subsequent stage to classify the eye gaze direction with a convolutional neural network. The proposed algorithm was tested on the Eye Chimera database and found to outperform state-of-the-art methods. The computational cost of the algorithm is very low in the testing phase; it achieves an average frame rate of 24 fps in a desktop environment.
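A hedged sketch of the two-stage pipeline, using OpenCV's stock Haar cascade for stage one and a toy CNN for stage two; the geometric eye-region prior, network shape, number of classes, and the input file name are all illustrative assumptions.

```python
import cv2
import torch
import torch.nn as nn

# Stage 1: Viola-Jones face detection, then a geometric eye-region crop.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def eye_region(gray):
    faces = cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    # Rough geometric prior: eyes lie in the upper-middle band of the face.
    return gray[y + h // 5 : y + h // 2, x : x + w]

# Stage 2: a small CNN classifies the crop into gaze-direction classes.
class GazeCNN(nn.Module):
    def __init__(self, n_classes=7):  # e.g. 7 eye-accessing-cue classes
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, x):
        return self.net(x)

img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input
if img is not None:
    crop = eye_region(img)
    if crop is not None:
        x = torch.from_numpy(cv2.resize(crop, (64, 32))).float()[None, None] / 255
        direction = GazeCNN()(x).argmax(dim=1)
```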
Fast and accurate algorithm for eye localisation for gaze tracking in low-resolution images
Anjith George, A. Routray
IET Computer Vision, 2016
Iris centre (IC) localisation in low-resolution visible images is a challenging problem in the computer vision community due to noise, shadows, occlusions, pose variations, eye blinks, etc. This study proposes an efficient method for determining the IC in low-resolution images in the visible spectrum, so that even low-cost consumer-grade webcams can be used for gaze tracking without any additional hardware. A two-stage algorithm exploiting the geometrical characteristics of the eye is proposed. In the first stage, a fast convolution-based approach is used to obtain a coarse location of the IC. The IC location is then refined in the second stage using boundary tracing and ellipse fitting. The algorithm has been evaluated on public databases such as BioID and Gi4E and is found to outperform state-of-the-art methods.
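The two stages might look like the following OpenCV sketch, assuming an 8-bit grayscale eye crop: a dark-circle template convolution gives a coarse centre, and an ellipse fitted to nearby edge points refines it. The template radius, search box, and Canny thresholds are illustrative.

```python
import cv2
import numpy as np

def coarse_iris_center(eye_gray, radius=10):
    """Stage 1: convolve with a dark-circle template; the iris is a dark,
    roughly circular blob, so the response peaks near its centre."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    template = (xx**2 + yy**2 <= radius**2).astype(np.float32)
    template /= template.sum()
    response = cv2.filter2D(255.0 - eye_gray.astype(np.float32), -1, template)
    return np.unravel_index(np.argmax(response), response.shape)  # (y, x)

def refine_with_ellipse(eye_gray, coarse_yx, box=20):
    """Stage 2: trace the iris boundary around the coarse estimate and
    fit an ellipse; its centre is the refined IC."""
    y, x = coarse_yx
    y0, x0 = max(0, y - box), max(0, x - box)
    roi = eye_gray[y0:y + box, x0:x + box]
    edges = cv2.Canny(roi, 50, 150)
    pts = cv2.findNonZero(edges)
    if pts is None or len(pts) < 5:   # fitEllipse needs >= 5 points
        return coarse_yx
    (cx, cy), _, _ = cv2.fitEllipse(pts)
    return (int(cy) + y0, int(cx) + x0)
```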
KBOC: Keystroke biometrics OnGoing competition
A. Morales, Julian Fierrez, M. Gomez-Barrero, J. Ortega-Garcia, Roberto Daza, John V. Monaco, J. Filho, J. Canuto, Anjith George
2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS), 2016
This paper presents the first Keystroke Biometrics Ongoing evaluation platform and Competition (KBOC), organized to promote reproducible research and establish a baseline for person authentication using keystroke biometrics. The ongoing evaluation tool has been developed using the BEAT platform and includes keystroke sequences (fixed text) from 300 users acquired in 4 different sessions. In addition, the results of a parallel offline competition based on the same data and evaluation protocol are presented. The reported results achieve EERs as low as 5.32%, which represents a challenging baseline for keystroke recognition technologies evaluated on the new publicly available KBOC benchmark.
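For reference, the equal error rate (EER) quoted above is the operating point where the false acceptance and false rejection rates coincide. A generic sketch, unrelated to the KBOC evaluation code itself:

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Sweep a threshold over all observed scores and return the point
    where the false rejection and false acceptance rates cross."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best = (1.0, 0.0)
    for t in thresholds:
        frr = np.mean(genuine < t)    # genuine users rejected
        far = np.mean(impostor >= t)  # impostors accepted
        if abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2)
    return best[1]

genuine = np.random.normal(0.7, 0.1, 1000)   # toy score distributions
impostor = np.random.normal(0.3, 0.1, 1000)
print(f"EER: {equal_error_rate(genuine, impostor):.2%}")
```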
Alertness Monitoring System for Vehicle Drivers using Physiological Signals
Anwesha Sengupta, Anjith George, A. Dasgupta, Aritra Chaudhuri, Bibek Kabi, A. Routray
Unknown Venue, 2016
No abstract available for this publication.
A score level fusion method for eye movement biometrics
Anjith George, A. Routray
Pattern Recognition Letters, 2016
No abstract available for this publication.
2015
A Vision Based System for Monitoring the Loss of Attention in Automotive Drivers
A. Dasgupta, Anjith George, S. Happy, A. Routray
arXiv.org, 2015
No abstract available for this publication.
A real time facial expression classification system using Local Binary Patterns
S. Happy, Anjith George, A. Routray
International Conference on Intelligent Human Computer Interaction, 2015
Facial expression analysis is one of the popular fields of research in human-computer interaction (HCI), with applications in next-generation user interfaces, human emotion analysis, and behavior and cognitive modeling. In this paper, a facial expression classification algorithm is proposed which uses a Haar classifier for face detection, Local Binary Pattern (LBP) histograms of different block sizes of the face image as feature vectors, and Principal Component Analysis (PCA) to classify the facial expressions. Since the computational complexity of the algorithm is small, it runs in real time. A customizable approach is proposed for facial expression analysis, since expressions and their intensities vary from person to person. The system uses grayscale frontal face images to classify six basic emotions, namely happiness, sadness, disgust, fear, surprise, and anger.
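A sketch of the block-wise LBP histogram features, assuming uniform LBP with 8 neighbours at radius 1 and a 4x4 block grid (all illustrative choices):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def block_lbp_histograms(gray, grid=(4, 4), p=8, r=1):
    """Divide the face into a grid of blocks, compute a uniform-LBP
    histogram per block, and concatenate them into one feature vector."""
    lbp = local_binary_pattern(gray, P=p, R=r, method="uniform")
    n_bins = p + 2  # uniform patterns + one 'non-uniform' bin
    h, w = gray.shape
    bh, bw = h // grid[0], w // grid[1]
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = lbp[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
            feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

face = np.random.randint(0, 256, (96, 96)).astype(np.uint8)
feature_vector = block_lbp_histograms(face)  # length 16 blocks * 10 bins
```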
An improved algorithm for eye corner detection
A. Dasgupta, Anshit Mandloi, Anjith George, A. Routray
International Conference on Signal Processing and Communications, 2015
In this paper, a modified algorithm for the detection of nasal and temporal eye corners is presented. The algorithm is a modification of the Santos and Proença method. In the first step, we detect the face and the eyes using classifiers based on Haar-like features. We then segment out the sclera from the detected eye region, and from the segmented sclera we extract an approximate eyelid contour. Eye corner candidates are obtained using the Harris and Stephens corner detector, and a post-pruning step on these candidates finally locates the eye corners. The algorithm has been tested on the Yale and JAFFE databases, as well as on a database we created.
An On-board Video Database of Human Drivers
A. Dasgupta, Anjith George, S. Happy, A. Routray
arXiv.org, 2015
No abstract available for this publication.
Design and Implementation of Real-time Algorithms for Eye Tracking and PERCLOS Measurement for on board Estimation of Alertness of Drivers
Anjith George, A. Routray
arXiv.org, 2015
The alertness level of drivers can be estimated with computer vision based methods. The level of fatigue can be found from the value of PERCLOS: the ratio of closed-eye frames to the total number of frames processed. The main objective of the thesis is the design and implementation of real-time algorithms for the measurement of PERCLOS. In this work we have developed a real-time system which processes the video onboard and alerts the driver when a loss of alertness is detected. For accurate estimation of PERCLOS, the frame rate should be greater than 4 fps and the detection accuracy greater than 90%. For daytime eye detection we have used two main approaches: a Haar classifier based method and a Principal Component Analysis (PCA) based method. During nighttime, active Near Infra-Red (NIR) illumination is used, and a Local Binary Pattern (LBP) histogram based method detects the eyes. The accuracy of the algorithms was found to be above 90% at frame rates above 5 fps, which is suitable for the application.
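The PERCLOS computation itself is a sliding-window ratio; a minimal sketch, where the window length and alarm threshold are illustrative assumptions rather than the thesis's calibrated values:

```python
from collections import deque

class PerclosMonitor:
    """PERCLOS over a sliding window: the fraction of recent frames in
    which the eyes were classified as closed."""
    def __init__(self, window_frames=600, alarm_threshold=0.15):
        self.frames = deque(maxlen=window_frames)
        self.alarm_threshold = alarm_threshold

    def update(self, eye_closed: bool) -> bool:
        self.frames.append(eye_closed)
        perclos = sum(self.frames) / len(self.frames)
        return perclos >= self.alarm_threshold  # True -> raise alarm

# At 5 fps, a 600-frame window covers the last two minutes of driving.
monitor = PerclosMonitor()
for closed in [False, False, True, False, True]:
    drowsy = monitor.update(closed)
```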
A Framework for Fast Face and Eye Detection
Anjith George, A. Dasgupta, A. Routray
arXiv.org, 2015
Face detection is an essential step in many computer vision applications such as surveillance, tracking, medical analysis, and facial expression analysis. Several approaches to face detection have been proposed; among them, the Haar-like-features-based method is robust. Despite its robustness, the method has some limitations, but with simple modifications its performance can be made faster and more robust. The present work speeds up the original algorithm by downsampling the frames and analyses the effect of different scale factors. It also discusses the detection of tilted faces using an affine transformation of the input image.
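Both modifications are simple to express with OpenCV; in this sketch the scale factor and rotation angle are illustrative, and boxes found at low resolution are mapped back to the original frame:

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_downsampled(frame_gray, scale=0.5):
    """Run the detector on a downsampled frame, then map the boxes back
    to full resolution; detection cost drops roughly with scale^2."""
    small = cv2.resize(frame_gray, None, fx=scale, fy=scale)
    boxes = cascade.detectMultiScale(small, 1.1, 5)
    return [tuple(int(v / scale) for v in box) for box in boxes]

def detect_tilted(frame_gray, angle_deg=15):
    """Rotate the image with an affine transform so that tilted faces
    become upright for the (rotation-sensitive) Haar detector."""
    h, w = frame_gray.shape
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    rotated = cv2.warpAffine(frame_gray, m, (w, h))
    return cascade.detectMultiScale(rotated, 1.1, 5)
```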
A Drowsiness Detection Scheme Based on Fusion of Voice and Vision Cues
A. Dasgupta, Bibek Kabi, Anjith George, S. Happy, A. Routray
arXiv.org, 2015
No abstract available for this publication.
2013
An on-board vision based system for drowsiness detection in automotive drivers
A. Dasgupta, Anjith George, S. Happy, A. Routray, Tara Shanker
International Journal of Advances in Engineering Sciences and Applied Mathematics, 2013
No abstract available for this publication.
A Vision-Based System for Monitoring the Loss of Attention in Automotive Drivers
A. Dasgupta, Anjith George, S. Happy, A. Routray
IEEE transactions on intelligent transportation systems (Print), 2013
Onboard monitoring of the alertness level of an automotive driver has been a challenging research problem in transportation safety and management. In this paper, we propose a robust real-time embedded platform to monitor the loss of attention of the driver during both day and night driving conditions. The percentage of eye closure is used to indicate the alertness level. In this approach, the face is detected using Haar-like features and is tracked using a Kalman filter. The eyes are detected using principal component analysis during daytime and using block local-binary-pattern features during nighttime. Finally, the eye state is classified as open or closed using support vector machines. In-plane and off-plane rotations of the driver's face are compensated using affine and perspective transformations, respectively, and illumination variation is compensated using bi-histogram equalization. The algorithm has been cross-validated using brain signals and implemented on a single-board computer with a 1.66-GHz Intel Atom processor, 1 GB of RAM, x86 architecture, and a Windows Embedded XP operating system. The system is found to be robust under actual driving conditions.
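One concrete component, the bi-histogram equalization used to compensate illumination, equalizes the sub-histograms on either side of the mean intensity independently, so that mean brightness is roughly preserved. A minimal sketch assuming 8-bit grayscale input:

```python
import numpy as np

def bihistogram_equalization(gray):
    """Brightness-preserving bi-histogram equalization: split the image
    at its mean intensity and equalize each sub-histogram separately."""
    mean = int(gray.mean())
    out = np.empty_like(gray)
    for lo, hi, mask in ((0, mean, gray <= mean),
                         (mean + 1, 255, gray > mean)):
        vals = gray[mask]
        if vals.size == 0:
            continue
        hist, _ = np.histogram(vals, bins=hi - lo + 1, range=(lo, hi + 1))
        cdf = np.cumsum(hist) / vals.size
        # Stretch each half across its own sub-range [lo, hi].
        out[mask] = (lo + cdf[vals - lo] * (hi - lo)).astype(gray.dtype)
    return out

frame = np.random.randint(0, 256, (120, 160)).astype(np.uint8)
equalized = bihistogram_equalization(frame)
```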
2012
A video database of human faces under near Infra-Red illumination for human computer interaction applications
S. Happy, A. Dasgupta, Anjith George, A. Routray
International Conference on Intelligent Human Computer Interaction, 2012
Human Computer Interaction (HCI) is an evolving area of research for coherent communication between computers and human beings. Some of the important applications of HCI reported in the literature are face detection, face pose estimation, face tracking, and eye gaze estimation; the development of algorithms for these applications is an active field of research. However, the availability of standard databases to validate such algorithms is insufficient. This paper discusses the creation of such a database under Near Infra-Red (NIR) illumination. NIR illumination has gained popularity for night-mode applications, since prolonged exposure to Infra-Red (IR) lighting may lead to health issues. The database contains NIR videos of 60 subjects in different head orientations and with different facial expressions, facial occlusions, and illumination variation. This new database can be a valuable resource for the development and evaluation of algorithms on face detection, eye detection, head tracking, eye gaze tracking, etc. under NIR lighting.
Unknown
Team Switzerland Submission to NIST SRE24 Speaker Recognition Evaluation
Amrutha Prasad, Hatef Otroshi Shahreza, Andrés Carofilis, Aref Farhadipour, Shiran Liu, S. Madikeri, Anjith George, Petr Motlícek, Sébastien Marcel, Masoumeh Chapariniya, Valeriia Perepelytsia, Teodora Vukovic, Volker Dellwo
Unknown Venue, Unknown
No abstract available for this publication.
Supplement for “Face Reconstruction from Face Embeddings using Adapters to a Face Foundation Model”
Hatef Otroshi, Anjith George, Sébastien Marcel
Unknown Venue, Unknown
No abstract available for this publication.
Heterogeneous Face Recognition with Prepended Domain Transformers
Anjith George, Sébastien Marcel
Unknown Venue, Unknown
No abstract available for this publication.