The CHROMA-FIT Dataset: Characterizing Human Ranges of Melanin for Increased Tone-Awareness |
2024 |
Gabriella Pangelinan, Xavier Merino, Samuel Langborgh, Kushal Vangara, Joyce Annan, Audison Beaubrun, Troy Weekes, Michael C. King, Fifth Workshop on Demographic Variations in Performance of Biometric Algorithms. Abstract—The disparate performance of face analytics technology across demographic groups is a well-documented phenomenon. In particular, these systems tend toward lower accuracy for darker-skinned individuals. Prior research exploring this asymmetry has largely relied on discrete race categories, but such labels are increasingly deemed insufficient to describe the wide range of human phenotypical features. Skin tone is a more objective measure, but there is a dearth of reliable skin tone-related image data. Existing tone annotations are derived from the images alone, either by human reviewers or automated processes. However, without ground-truth skin tone measurements from the subjects of the images themselves, there is no way to assess the consistency or accuracy of post-hoc methods. In this work, we present CHROMA-FIT, the first publicly available dataset of face images and corresponding ground-truth skin tone measurements. Our goal is to provide a baseline for tone-labeling methods in assessing and improving their accuracy. The dataset comprises approximately 2,300 still images of 209 participants in indoor and outdoor collection environments. |
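As an illustration of how ground-truth measurements such as these could serve as a baseline for post-hoc tone labeling, the sketch below scores predicted six-point tone bins against ground-truth bins. The six-bin scale, the field layout, and the example labels are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical sketch: scoring a post-hoc skin tone labeler against
# ground-truth tone bins. The 6-bin scale and the example labels are
# assumptions, not the CHROMA-FIT schema.
import numpy as np

def agreement_metrics(ground_truth_bins, predicted_bins):
    """Compare per-image predicted tone bins (1-6) against ground-truth bins."""
    gt = np.asarray(ground_truth_bins)
    pred = np.asarray(predicted_bins)
    return {
        "exact": float(np.mean(gt == pred)),                    # exact-bin agreement
        "within_one": float(np.mean(np.abs(gt - pred) <= 1)),   # off-by-one tolerance
        "mae": float(np.mean(np.abs(gt - pred))),                # mean absolute bin error
    }

# Example with made-up bin labels:
print(agreement_metrics([1, 2, 5, 6, 3], [1, 3, 5, 4, 3]))
```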
|
Who Wore It Best? And Who Paid Less? Effects of Privacy-Preserving Techniques Across Demographics |
2024 |
Xavier Merino, Michael C. King, Fifth Workshop on Demographic Variations in Performance of Biometric Algorithms. Abstract—Face recognition technologies, widely adopted across various domains, have raised concerns related to demographic differentials in performance and the erosion of personal privacy. This study explores the potential of "cloaking," a privacy-preserving technique that subtly alters facial images at the pixel level to reduce recognition accuracy, in addressing these concerns. Specifically, we assess the effectiveness of the state-of-the-art Fawkes algorithm across demographic groups categorized by race (i.e., African American and Caucasian) and gender. Our findings reveal African American males as the most significant beneficiaries of this protective measure. Moreover, in terms of cost-effectiveness, the African American demographic, as a collective, enjoys greater protection with fewer visual disruptions compared to Caucasians. Nevertheless, we caution that while cloaking techniques like Fawkes bolster individual privacy, their protection may not remain absolute as recognition algorithms advance. Thus, we underscore the persistent need for prudent online data-sharing practices. |
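A minimal sketch of one way a per-group "protection rate" could be quantified: the fraction of cloaked probe images whose similarity to the subject's enrolled image falls below a fixed match threshold. The score representation, the threshold, and the group labels are assumptions for illustration, not the paper's evaluation protocol.

```python
# Hypothetical per-group "protection rate": fraction of cloaked probes whose
# similarity to the subject's gallery image is below the match threshold.
# Score source, threshold, and group labels are illustrative assumptions.
from collections import defaultdict

def protection_rate_by_group(records, threshold=0.4):
    """records: iterable of (group, similarity_of_cloaked_probe_to_gallery)."""
    protected, total = defaultdict(int), defaultdict(int)
    for group, score in records:
        total[group] += 1
        if score < threshold:          # cloak succeeded: probe no longer matches
            protected[group] += 1
    return {g: protected[g] / total[g] for g in total}

# Example with made-up similarity scores:
demo = [("AA-male", 0.31), ("AA-male", 0.45), ("C-female", 0.52), ("C-female", 0.38)]
print(protection_rate_by_group(demo))
```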
|
Impact of Blur and Resolution on Demographic Disparities in 1-to-Many Facial Identification |
2024 |
Aman Bhatta, Gabriella Pangelinan, Michael C. King, Kevin W. Bowyer, Third Workshop on Image/Video/Audio Quality in Computer Vision and Generative AI. Abstract—Most studies to date that have examined demographic variations in face recognition accuracy have analyzed 1-to-1 matching accuracy, using images that could be described as "government ID quality". This paper analyzes the accuracy of 1-to-many facial identification across demographic groups, and in the presence of blur and reduced resolution in the probe image as might occur in "surveillance camera quality" images. Cumulative match characteristic (CMC) curves are not appropriate for comparing the propensity for rank-one recognition errors across demographics, so we use three metrics in our analysis: (1) the well-known d' metric between the mated and non-mated score distributions; (2) introduced in this work, the absolute score difference between thresholds in the high-similarity tail of the non-mated distribution and the low-similarity tail of the mated distribution; and (3) the distribution of the difference between mated and non-mated rank-one scores across the set of probe images. We find that demographic variation in 1-to-many accuracy does not entirely follow what has been observed in 1-to-1 matching accuracy. Also, unlike 1-to-1 accuracy, demographic comparison of 1-to-many accuracy can be affected by different numbers of identities and images across demographics. More importantly, we show that increased blur in the probe image, or reduced resolution of the face in the probe image, can significantly increase the false positive identification rate. We also show that the demographic variation in these high-blur or low-resolution conditions is much larger for male / female than for African-American / Caucasian. The point that 1-to-many accuracy can potentially collapse when processing "surveillance camera quality" probe images against a "government ID quality" gallery is an important one. |
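The first metric above, d', has a standard pooled-variance form; the sketch below computes it, together with a simple tail-gap statistic, from arrays of mated and non-mated scores. The exact d' variant, the percentiles used for the tails, and the synthetic scores are assumptions for illustration, not the paper's experimental values.

```python
# Minimal sketch of the d' separation statistic between mated and non-mated
# score distributions (common pooled-variance form) and a tail-gap statistic.
# The percentiles and the synthetic scores below are illustrative assumptions.
import numpy as np

def d_prime(mated, nonmated):
    mated, nonmated = np.asarray(mated, float), np.asarray(nonmated, float)
    pooled_var = 0.5 * (mated.var(ddof=1) + nonmated.var(ddof=1))
    return abs(mated.mean() - nonmated.mean()) / np.sqrt(pooled_var)

def tail_gap(mated, nonmated, pct=1.0):
    """Gap between the low-similarity tail of the mated scores and the
    high-similarity tail of the non-mated scores (positive = separated)."""
    return np.percentile(mated, pct) - np.percentile(nonmated, 100 - pct)

rng = np.random.default_rng(0)
mated = rng.normal(0.7, 0.08, 5000)      # made-up genuine scores
nonmated = rng.normal(0.2, 0.10, 5000)   # made-up impostor scores
print(d_prime(mated, nonmated), tail_gap(mated, nonmated))
```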
|
Unmasking the Threat: Detecting Cloaking Attacks |
2023 |
Xavier Merino, Pei Zhou, Michael C. King, 14th International Conference on Information & Communication Technology and System (ICTS). Abstract—Individuals who are concerned about their privacy may choose to safeguard their online photographs by adding
|
Analyzing the Impact of Gender Misclassification on Face Recognition Accuracy |
2023 |
Afi Edem Edi Gbekevi, Paloma Vela Achu, Gabriella Pangelinan, Michael King, Kevin W. Bowyer, Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023. Abstract—Automated face recognition technologies have been under scrutiny in recent years due to noted variations in accuracy relative to race and gender. Much of this concern was driven by widespread media reporting of high error rates for women and persons of color reported in an evaluation of commercial gender classification ("gender from face") tools. Many decried the conflation of errors observed in the task of gender classification with the task of face recognition. This motivated the question of whether images that are misclassified by a gender classification algorithm have an increased error rate with face recognition algorithms. In the first experiment, we analyze the False Match Rate (FMR) of face recognition for comparisons in which one or both of the images are gender-misclassified. In the second experiment, we examine match scores of gender-misclassified images when compared to images from their labeled versus classified gender. We find that, in general, gender-misclassified images are not associated with an increased FMR. For females, non-mated comparisons involving one misclassified image actually shift the resultant impostor distribution to lower similarity scores, representing improved accuracy. To our knowledge, this is the first work to analyze (1) the FMR of one- and two-misclassification error pairs and (2) non-mated match scores for misclassified images against labeled- and classified-gender categories. |
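A sketch of the kind of partitioned FMR computation described in the first experiment: non-mated comparisons are grouped by whether zero, one, or both images are gender-misclassified, and FMR is computed per group at a fixed threshold. The pair format, the threshold, and the scores are illustrative assumptions, not the paper's setup.

```python
# Hypothetical sketch: FMR for non-mated pairs grouped by the number of
# gender-misclassified images in the pair (0, 1, or 2). The pair format
# and threshold are illustrative assumptions.
from collections import defaultdict

def fmr_by_misclassification(pairs, threshold=0.4):
    """pairs: iterable of (score, img1_misclassified, img2_misclassified)
    for non-mated comparisons only."""
    false_matches, totals = defaultdict(int), defaultdict(int)
    for score, m1, m2 in pairs:
        k = int(m1) + int(m2)          # 0, 1, or 2 misclassified images in the pair
        totals[k] += 1
        if score >= threshold:         # impostor pair accepted -> false match
            false_matches[k] += 1
    return {k: false_matches[k] / totals[k] for k in totals}

# Example with made-up non-mated scores:
print(fmr_by_misclassification([(0.35, 0, 0), (0.42, 1, 0), (0.28, 1, 1), (0.45, 0, 0)]))
```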
|
Analysis of Manual and Automated Skin Tone Assignments |
2022 |
K.S. Krishnapriya, Gabriella Pangelinan, Michael King, Kevin W. Bowyer, Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022. Abstract—The Fitzpatrick scale is a standard tool in dermatology to classify skin types by melanin content and sensitivity to sun exposure. After an in-person interview, the dermatologist classifies the person's skin type on a six-valued, light-to-dark scale. Various face image analysis researchers have recently categorized skin tone in face images on a six-valued, light-to-dark scale in order to investigate questions of bias and accuracy related to skin tone. Categorization of skin tone from images rather than a personal interview is not, strictly speaking, a rating on the Fitzpatrick scale. While the manual assignment of face images on a six-point, light-to-dark scale has been used by various researchers studying bias in face image analysis, to date there has been no study on the consistency and reliability of observers assigning skin type from an image. We analyze a set of manual skin type assignments from multiple observers viewing the same image set and find that there are inconsistencies between human raters. We then develop an algorithm for automated skin type assignments, which could be used in place of manual assignment by observers. Such an algorithm would allow for provision of skin tone annotations on large quantities of images beyond what could be accomplished by manual raters. To our knowledge, this is the first work to: (a) examine the consistency of manual skin tone ratings across observers, (b) document that there is substantial variation in the rating of the same image by different observers even when exemplar images are given for guidance and all images are color-corrected, and (c) compare manual versus automated skin tone ratings. We release the automated skin tone rating implementation so that other researchers may reproduce and extend the results in this paper. |
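The abstract does not describe the internals of the released automated rater, so the sketch below shows only one common style of automated tone rating: binning a face-skin lightness statistic into a six-point light-to-dark scale. The use of CIELAB L*, the bin edges, and the omitted skin-pixel detection step are assumptions, not the paper's algorithm.

```python
# Illustrative sketch only: bin a mean skin-lightness value into a six-point
# light-to-dark scale. The skin-pixel selection, the use of CIELAB L*, and
# the bin edges are assumptions, NOT the released implementation.

def tone_bin_from_lightness(mean_L_star, edges=(75, 65, 55, 45, 35)):
    """Map mean CIELAB L* of detected skin pixels to a 1 (light) - 6 (dark) bin."""
    for bin_index, edge in enumerate(edges, start=1):
        if mean_L_star >= edge:
            return bin_index
    return 6

print([tone_bin_from_lightness(v) for v in (80, 68, 50, 30)])  # -> [1, 2, 4, 6]
```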
|
The Criminality From Face Illusion |
2020 |
Kevin W. Bowyer, Michael King, Walter Scheirer, Kushal Vangara, IEEE Transactions on Technology and Society, June 2020. Abstract—The automatic analysis of face images can generate predictions about a person's gender, age, race, facial expression, body mass index, and various other indices and conditions. A few recent publications have claimed success in analyzing an image of a person's face in order to predict the person's status as Criminal / Non-Criminal. Predicting criminality from face may initially seem similar to other facial analytics, but we argue that attempts to create a criminality-from-face algorithm are necessarily doomed to fail, that apparently promising experimental results in recent publications are an illusion resulting from inadequate experimental design, and that there is potentially a large social cost to belief in the criminality-from-face illusion. |
|
Issues Related to Face Recognition Accuracy Varying Based on Race and Skin Tone |
2020 |
K. S. Krishnapriya, Vítor Albiero, Kushal Vangara, Michael C. King, IEEE Transactions on Technology and Society, March 2020. Abstract—Face recognition technology has recently become controversial over concerns about possible bias due to accuracy varying based on race or skin tone. We explore three important aspects of face recognition technology related to this controversy. Using two different deep convolutional neural network face matchers, we show that for a fixed decision threshold, the African-American image cohort has a higher false match rate (FMR), and the Caucasian cohort has a higher false non-match rate. We present an analysis of the impostor distribution designed to test the premise that darker skin tone causes a higher FMR, and find no clear evidence to support this premise. Finally, we explore how using face recognition for one-to-many identification can have a very low false-negative identification rate and still present concerns related to the false-positive identification rate. Both the ArcFace and VGGFace2 matchers and the MORPH dataset used in our experiments are available to the research community so that others can reproduce or reanalyze our results. |
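A minimal sketch of the fixed-threshold comparison described above: with one shared decision threshold, FMR is computed from each cohort's impostor scores and FNMR from its genuine scores. The threshold and the synthetic score distributions are placeholders, not values from the paper.

```python
# Minimal sketch of a fixed-threshold analysis: one shared decision threshold,
# FMR from impostor scores and FNMR from genuine scores, per cohort.
# The threshold and score distributions are synthetic placeholders.
import numpy as np

def fmr_fnmr(impostor_scores, genuine_scores, threshold):
    impostor = np.asarray(impostor_scores)
    genuine = np.asarray(genuine_scores)
    fmr = float(np.mean(impostor >= threshold))   # impostors wrongly accepted
    fnmr = float(np.mean(genuine < threshold))    # genuine pairs wrongly rejected
    return fmr, fnmr

rng = np.random.default_rng(1)
cohorts = {
    "cohort_A": (rng.normal(0.22, 0.10, 10000), rng.normal(0.70, 0.09, 10000)),
    "cohort_B": (rng.normal(0.18, 0.10, 10000), rng.normal(0.66, 0.11, 10000)),
}
for name, (impostor, genuine) in cohorts.items():
    print(name, fmr_fnmr(impostor, genuine, threshold=0.45))
```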
|
Analysis of Gender Inequality In Face Recognition Accuracy |
2020 |
Vítor Albiero, K.S. Krishnapriya, Kushal Vangara, Kai Zhang, Michael C. King, Kevin W. Bowyer, IEEE Winter Applications of Computer Vision Workshops (WACVW), March 2020 |
|
Does Face Recognition Accuracy Get Better With Age? Deep Face Matchers Say No |
2019 |
Vítor Albiero, Kevin W. Bowyer, Kushal Vangara, Michael C. King, IEEE/CVF Winter Conference, Nov 2019. Abstract—Previous studies generally agree that face recognition accuracy is higher for older persons than for younger persons. But most previous studies were before the wave of deep learning matchers, and most considered accuracy only in terms of the verification rate for genuine pairs. This paper investigates accuracy for age groups 16-29, 30-49 and 50-70, using three modern deep CNN matchers, and considers differences in the impostor and genuine distributions as well as verification rates and ROC curves. We find that accuracy is lower for older persons and higher for younger persons. In contrast, a pre-deep-learning matcher on the same dataset shows the traditional result of higher accuracy for older persons, although its overall accuracy is much lower than that of the deep learning matchers. Comparing the impostor and genuine distributions, we conclude that impostor scores have a larger effect than genuine scores in causing lower accuracy for the older age group. We also investigate the effects of training data across the age groups. Our results show that fine-tuning the deep CNN models on additional images of older persons actually lowers accuracy for the older age group. Also, we fine-tune and train from scratch two models using age-balanced training datasets, and these results also show lower accuracy for the older age group. These results argue that the lower accuracy for the older age group is not due to imbalance in the original training data. |
|
Characterizing the Variability in Face Recognition Accuracy Relative to Race |
2019 |
Krishnapriya K. S., Kushal Vangara, Michael C. King, Vítor Albiero, Kevin Bowyer, IEEE/CVF Conference, April 2019. Abstract—Many recent news headlines have labeled face recognition technology as "biased" or "racist". We report on a methodical investigation into differences in face recognition accuracy between African-American and Caucasian image cohorts of the MORPH dataset. We find that, for all four matchers considered, the impostor and the genuine distributions are statistically significantly different between cohorts. For a fixed decision threshold, the African-American image cohort has a higher false match rate and a lower false non-match rate. ROC curves compare verification rates at the same false match rate, but the different cohorts achieve the same false match rate at different thresholds. This means that ROC comparisons are not relevant to operational scenarios that use a fixed decision threshold. We show that, for the ResNet matcher, the two cohorts have approximately equal separation of impostor and genuine distributions. Using ICAO compliance as a standard of image quality, we find that the initial image cohorts have unequal rates of good quality images. The ICAO-compliant subsets of the original image cohorts show improved accuracy, with the main effect being to reduce the low-similarity tail of the genuine distributions. |
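The ROC point made above can be illustrated by solving, per cohort, for the threshold that achieves a target FMR and observing that the thresholds differ. The synthetic impostor scores and the target FMR below are placeholders, not results from the paper.

```python
# Illustrative sketch of why ROC comparisons can mislead under a fixed
# threshold: the threshold that achieves a given target FMR differs by cohort.
# Impostor score distributions here are synthetic placeholders.
import numpy as np

def threshold_at_fmr(impostor_scores, target_fmr=1e-3):
    """Threshold at which roughly target_fmr of impostor scores are accepted."""
    return float(np.quantile(np.asarray(impostor_scores), 1.0 - target_fmr))

rng = np.random.default_rng(2)
print("cohort_A threshold:", threshold_at_fmr(rng.normal(0.22, 0.10, 100000)))
print("cohort_B threshold:", threshold_at_fmr(rng.normal(0.18, 0.10, 100000)))
```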
|
Towards An Adaptable System-based Classification Design for Cyber Identity |
2018 |
Mary (Kay) Michel, Michael King, IEEE Cyber Science/SA UK 2018, Scotland, UK. Abstract—As cybercrime activity continues to increase with significant data growth and the Internet of Things (IoT), this research introduces a new, proactive, methodically designed approach in place of current reactive and specialized methods. A novel holistic identity classification scheme and information architecture are proposed. This approach has an adaptive, common cybernetic trait design to support a changing technological landscape and human behavior. Common cyber identity base trait dimensions for context, physical, cyber, and human aspects allow for systematic analysis of temporal evidence to help resolve a physical person's identity in a cybercrime. This research platform supports both broad and targeted identity analytics, utilizing advanced machine learning methods and mixed-media visualizations to facilitate Cyber Situational Awareness (SA). Early PhD experimentation with real-world use cases shows promise with regard to providing salient attributes and patterns of cyber activity that are unique to a person. |
|
Cyber Identity: Salient Trait Ontology and Computational Framework to Aid in Solving Cybercrime |
2018 |
Mary (Kay) Michel, Marco Carvalho, Heather Crawford, Albert C. Esterline, IEEE TrustCom 2018. Abstract—Cyber forensics is challenging due to the lack of defined holistic features with a ground-truth identity core, and of scalable, systematic methods to credibly link a person's physical and cyber attributes in a complex networked environment. Cybercrime continues to grow as humans conduct more online activities that generate sensitive data while connected to anyone around the world. In this work, we propose a new classification-based ontology and computational framework for resolving an identity based on cyber activities. Our ontology and framework extend legal case situational theory research to temporally map cyber and physical categorical traits. Initial experimentation based on real-world legal cases reveals contextual salient traits that are most effective in linking evidence to a person's profile or unique identity. As a result, these multi-dimensional traits support innovative visualizations that depict a person's linkable identity core, digital artifacts, security, and technology. The impact of our ontology and framework design is to support solving cybercrime by aiding in identity resolution. |
|
Categorization of Discoverable Cyber Attributes for Identity Protection, Privacy, and Analytics |
2018 |
Mary (Kay) Michel, Michael King, IEEE SoutheastCon, April 19, 2018. Abstract—The Internet has become a major source of data that many regard as personally identifiable, and the ease of accessibility may be considered an invasion of privacy. While there are certainly benign uses of this data, it has also facilitated increases in identity theft and identity fraud. This paper presents classes of cyber identity attributes that can aid in the analysis and protection of a person's sensitive data in a complex, changing environment. Our research is motivated by the need to understand and organize identity attributes in such a way as to inform the general public of what Personally Identifiable Information (PII) is available and how it may be better protected. In this paper, we outline and discuss five major categories of identity attributes that are discoverable online. The categories cover biographic, behavioral, relationship, biometric, and physiological data, all of which are part of a holistic representational model for identity analytics. |
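A small data-structure sketch of the five attribute categories named above; the example attributes noted in the comments are illustrative guesses, not drawn from the paper.

```python
# Sketch of the five discoverable cyber identity attribute categories listed
# above. The example attributes in the comments are illustrative guesses.
from enum import Enum

class IdentityAttributeCategory(Enum):
    BIOGRAPHIC = "biographic"        # e.g., name, date of birth
    BEHAVIORAL = "behavioral"        # e.g., posting habits, writing style
    RELATIONSHIP = "relationship"    # e.g., social connections
    BIOMETRIC = "biometric"          # e.g., face images, fingerprints
    PHYSIOLOGICAL = "physiological"  # e.g., height, health indicators

# A discovered attribute tagged with its category (illustrative only):
attribute = {"name": "date_of_birth", "category": IdentityAttributeCategory.BIOGRAPHIC}
print(attribute)
```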
|
Cyber Biometrics: The Face and Text Profiler |
2012 |
Kay Michel, Chris Moffatt, and Liam Mayron, IEEE MILCOM 2012 Classified session, October 29, 2012. Abstract—This paper presents a novel neural network text gender classification method. Also included are evaluations of existing facial recognition algorithms that can be used to provide a higher probability of correct gender identification when coupled with our text profiler algorithm. Test results are provided from experimental trials of our neural network technique, which identifies whether the author of an online text quote was male or female based on the psycholinguistic text attributes found most effective in prior research. |
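A minimal, hypothetical sketch in the spirit of the described profiler: a small neural network over hand-crafted text features. The features, the placeholder labels, and the network shape are assumptions, not the paper's psycholinguistic attributes or architecture.

```python
# Hypothetical sketch of a small neural network over hand-crafted text
# features. Features, labels, and network shape are placeholders, not the
# paper's psycholinguistic attributes or architecture.
import numpy as np
from sklearn.neural_network import MLPClassifier

def simple_features(text):
    words = text.split()
    return [
        len(words),                                           # quote length in words
        sum(len(w) for w in words) / max(len(words), 1),      # mean word length
        text.count("!") + text.count("?"),                    # emphatic punctuation
        sum(w.lower() in {"i", "me", "my"} for w in words),   # first-person pronouns
    ]

quotes = ["the meeting ran long today", "I finished my report early!",
          "results look fine to me", "we should test this again?"]
labels = [0, 1, 0, 1]   # arbitrary placeholder labels, not real annotations
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(np.array([simple_features(q) for q in quotes]), labels)
print(clf.predict([simple_features("I think my answer is right!")]))
```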
|
Cognitive Cyber Situational Awareness Using Virtual Worlds |
2011 |
Kay Michel, Nathan Helmick, and Liam Mayron, IEEE CogSIMA, February 22, 2011. Abstract—The use of network data visualization tools for cyber security is particularly challenging when large amounts of diverse data are displayed and continuously updated. These real-time changes can be used for context-sensitive decision making, but are impeded by a lack of expressive visualization techniques. In this work, we propose a new method for the visualization of network traffic: virtual worlds. These three-dimensional, immersive environments allow the representation of data and metrics within an expressive environment intuitive to many users. Furthermore, they provide a unique medium for users to collaborate in identifying and isolating security vulnerabilities. We provide a description of the system for situational awareness and a proposed experiment involving the cognitive processes of human vision, perception, and action.