Prominent Face Image Datasets

March 15, 2020
Krishnapriya K S

The collection of face image datasets has grown steadily over the past decade, and researchers are advised to use a standard test dataset so that their results can be compared. The choice of an appropriate dataset depends on several characteristics, including the task to be performed, the algorithm to be trained or tested, and the properties of the datasets against which results will be compared. The following are the most prominent face image datasets used for evaluating face recognition technology.
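As a rough illustration of choosing by dataset properties, the sketch below shortlists datasets by total image count and by average images per subject. The figures are copied from a few entries in this list; the `FaceDataset` class, the `shortlist` helper, and the thresholds are hypothetical names and values chosen for the example, not part of any dataset's tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FaceDataset:
    """One catalog entry (hypothetical structure for this example)."""
    name: str
    year: int
    images: int
    subjects: Optional[int]  # None where the list gives no subject count


# Figures copied from a few entries in this list.
CATALOG = [
    FaceDataset("Megaface", 2016, 4_700_000, 672_057),
    FaceDataset("VGG2", 2018, 3_310_000, 9_131),
    FaceDataset("Casia-Webface", 2014, 494_414, 10_575),
    FaceDataset("CelebA", 2015, 202_599, 10_177),
    FaceDataset("LFW(A)", 2007, 13_233, 5_749),
    FaceDataset("UTKface", 2017, 24_108, None),
]


def images_per_subject(ds: FaceDataset) -> Optional[float]:
    """Average images per subject, or None if the subject count is unknown."""
    if ds.subjects is None:
        return None
    return ds.images / ds.subjects


def shortlist(catalog: List[FaceDataset],
              min_images: int = 0,
              min_images_per_subject: float = 0.0) -> List[FaceDataset]:
    """Keep datasets meeting both thresholds, deepest (most images per subject) first."""
    picked = []
    for ds in catalog:
        ratio = images_per_subject(ds)
        if ratio is None:
            continue  # cannot judge depth without a subject count
        if ds.images >= min_images and ratio >= min_images_per_subject:
            picked.append(ds)
    return sorted(picked, key=images_per_subject, reverse=True)


if __name__ == "__main__":
    for ds in shortlist(CATALOG, min_images=100_000, min_images_per_subject=10):
        print(f"{ds.name}: {images_per_subject(ds):.1f} images per subject")
```

A high images-per-subject ratio suits training a recognition model, while a dataset with many subjects and few images each is closer to a large-gallery test; the thresholds here are placeholders to adjust for the task at hand.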

MS-Celeb-1M

Published: 2016
Images: 8.2 million
Subjects: 4,101
Source: American and British actors
Publicly available: Yes
Download at: Microsoft Celeb Dataset
Download clean version at: C-MS-Celeb

Reference: Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao. MS-Celeb-1M: A dataset and benchmark for large scale face recognition. In European Conf. on Computer Vision (ECCV), 2016.

Megaface

Published: 2016
Images: 4.7 million
Subjects: 672,057
Source: Flickr users’ photo albums
Publicly available: Yes
Download at: Megaface

Reference: Ira Kemelmacher-Shlizerman, Steven M. Seitz, Daniel Miller, and Evan Brossard. The MegaFace benchmark: 1 million faces for recognition at scale. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2016.

VGG2

Published: 2018
Images: 3.31 million
Subjects: 9,131
Source: Google images of actors, athletes, and politicians
Publicly available: Yes
Download at: VGG2

Reference: Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman. VGGFace2: A dataset for recognising faces across pose and age. In Intl. Conf. on Automatic Face and Gesture Recognition (FG), 2018.

VGG

Published: 2015
Images: 2.6 million
Subjects: 2,622
Source: Google images of actors, athletes, and politicians
Publicly available: Yes
Download at: VGG

Reference: O. M. Parkhi, A. Vedaldi, and A. Zisserman. Deep face recognition. In BMVC, 2015.

IMDB-Face

Published: 2018
Images: 1.7 million
Subjects: 59K
Source: Celebrities collected from movie screenshots and posters from the IMDb website
Publicly available: Yes
Download at: IMDB-Face

Reference: Fei Wang, Liren Chen, Cheng Li, Shiyao Huang, Yanjie Chen, Chen Qian, and Chen Change Loy. The devil of face recognition is in the noise. In European Conf. on Computer Vision (ECCV), 2018.

Diversity in Faces (DiF)

Published: 2019
Images: 0.97 million
Source: Users of the Flickr photo service
Publicly available: Yes
Download at: DiF

Reference: Michele Merler, Nalini Ratha, Rogerio S. Feris, and John R. Smith. Diversity in faces. arXiv preprint arXiv:1901.10436, 2019.

IMDB-Wiki

Published: 2018
Images: 523,051
Source: Celebrities from IMDb and Wikipedia
Publicly available: Yes
Download at: IMDB-Wiki

Reference: T. Rothe, R. Timofte, and L. Van Gool. Deep expectation of real and apparent age from a single image without facial landmarks. Int. J. Comput. Vis., pages 126–144, 2018.

Casia-Webface

Published: 2014
Images: 494,414
Subjects: 10,575
Source: Crawled from Internet
Publicly available: Yes
Download at: Casia-Webface

Reference: Dong Yi, Zhen Lei, Shengcai Liao, and Stan Z. Li. Learning face representation from scratch. arXiv preprint, 2014.

UMDFaces

Published: 2016
Images: 367,888
Subjects: 8,277
Source: Crawled from Internet
Publicly available: Yes
Download at: UMDFaces

Reference: Ankan Bansal, Anirudh Nanduri, Carlos D. Castillo, Rajeev Ranjan, and Rama Chellappa. UMDFaces: An annotated face dataset for training deep networks. arXiv preprint, 2016.

CelebA

Published: 2015
Images: 202,599
Subjects: 10,177
Source: Celebrity images
Publicly available: Yes
Download at: CelebA

Reference: Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In IEEE Intl. Conf. on Computer Vision (ICCV), 2015.

CACD

Published: 2014
Images: 163,446
Subjects: 2,000
Source: Celebrity images
Publicly available: Yes
Download at: CACD

Reference: B. C. Chen, C. S. Chen, and W. H. Hsu. Face recognition and retrieval using cross-age reference coding with cross-age celebrity dataset. IEEE Trans. on Multimedia, 17(6):804–815, 2015.

FaceScrub

Published: 2014
Images: 106,863
Subjects: 530
Source: Public figures on the Internet
Publicly available: Yes
Download at: FaceScrub

Reference: H.-W. Ng and S. Winkler. A data-driven approach to cleaning large face datasets. In ICIP, 2014.

IJB-C

Published: 2018
Images: 31,334
Subjects: 3,531
Source: Celebrities and Internet personalities
Publicly available: Yes
Download at: IJB-C

Reference: B. Maze, J. Adams, J. A. Duncan, N. Kalka, T. Miller, C. Otto, A. K. Jain, W. T. Niggel, J. Anderson, J. Cheney, and P. Grother. IARPA Janus Benchmark – C: Face dataset and protocol. In Intl. Conf. on Biometrics (ICB), 2018.

IJB-B

Published: 2017
Images: 21,798
Subjects: 1,845
Source: Celebrities and Internet personalities
Publicly available: Yes
Download at: IJB-B

Reference: C. Whitelam, E. Taborsky, A. Blanton, B. Maze, J. Adams, T. Miller, N. Kalka, A. K. Jain, J. A. Duncan, K. Allen, J. Cheney, and P. Grother. IARPA Janus Benchmark-B face dataset. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) Workshop, 2017.

Pubfig

Published: 2011
Images: 58,797
Subjects: 200
Source: Internet personalities
Publicly available: Yes
Download at: Pubfig

Reference: N. Kumar, A. Berg, P. N. Belhumeur, and S. Nayar. Describable visual attributes for face verification and image search. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 33(10), 2011.

Morph

Published: 2006
Images: 55,134
Subjects: 13,618
Source: Public records
Publicly available: Yes
Download at: Morph

Reference: Karl Ricanek and Tamirat Tesafaye. Morph: A longitudinal image database of normal adult age-progression. In Intl. Conf. on Automatic Face and Gesture Recognition (FG), 2006.

Adience

Published: 2014
Images: 26,580
Subjects: 2,284
Source: Online image repositories
Publicly available: Yes
Download at: Adience

Reference: Eran Eidinger, Roee Enbar, and Tal Hassner. Age and gender estimation of unfiltered faces. IEEE Trans. on Information Forensics and Security, 9(12), 2014.

UTKface

Published: 2017
Images: 24,108
Source: Internet personalities
Publicly available: Yes
Download at: UTKface

Reference: Zhifei Zhang, Yang Song, and Hairong Qi. Age progression/regression by conditional adversarial autoencoder. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017.

AgeDB

Published: 2017
Images: 16,488
Subjects: 568
Source: Manually collected Google images
Publicly available: Yes
Download at: AgeDB

Reference: S. Moschoglou, A. Papaioannou, C. Sagonas, J. Deng, I. Kotsia, and S. Zafeiriou. AgeDB: The first manually collected, in-the-wild age database. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) Workshop, Honolulu, Hawaii, 2017.

LFW(A)

Published: 2007
Images: 13,233
Subjects: 5,749
Source: Web images
Publicly available: Yes
Download at: LFW(A)

Reference: Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007.

LFW+

Published: 2017
Images: 15,699
Subjects: 8,000
Source: Google Images
Publicly available: Yes
Download at: LFW+

Reference: H. Han, A. K. Jain, S. Shan, and X. Chen. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 2017.

IJB-A

Published: 2015
Images: 5,712
Subjects: 500
Source: Celebrities and Internet personalities
Publicly available: Yes
Download at: IJB-A

Reference: B. F. Klare, B. Klein, E. Taborsky, A. Blanton, J. Cheney, K. Allen, P. Grother, A. Mah, M. Burge, and A. K. Jain. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2015.

PPB

Published: 2018
Images: 1,270
Subjects: 1,270
Source: Parliamentarians from three African countries (Rwanda, Senegal, and South Africa) and three European countries (Iceland, Finland, and Sweden)
Publicly available: Yes
Download at: PPB

Reference: Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the 1st Conf. on Fairness, Accountability and Transparency, 2018.

FGNet

Published: 2016
Images: 1,002
Subjects: 82
Source: Scanning photographs of subjects found in personal collections
Publicly available: Yes
Download at: FGNet

Reference: Gabriel Panis and Andreas Lanitis. An overview of research on facial aging using the FG-NET aging database. IET Biometrics, 5(2):37–46, 2016.

KP's note