Many applications collect personally identifiable information, so collecting and storing images has become commonplace. Recent legislation in many jurisdictions makes it difficult to acquire such data without anonymization or individual consent.
Blurring is a common traditional method of image anonymization, but it severely distorts the data and makes it useless for other purposes. Generative models can now synthesize realistic faces that fit a given context, which has led to the introduction of realistic anonymization. Although current approaches aim to hide a person’s identity, they anonymize only the face and leave other primary and secondary identifiers untouched.
Surface-guided GANs (SG-GANs) offer full-body anonymization using dense pixel-to-surface correspondences derived from continuous surface embeddings (CSE). However, this approach is prone to visual artifacts that degrade image quality. The researchers attribute the poor visual quality to the training dataset, which is derived from COCO and contains only 40,000 human figures. The CSE segmentation used for anonymization also does not cover hair or accessories, so the anonymized person frequently “wears” them anyway. In addition, SG-GAN fails to anonymize many people because the CSE detector often misses people who are far from the camera.
A new study from the Norwegian University of Science and Technology extends surface-guided GANs to address the low visual quality and the insufficient anonymization caused by inadequate segmentation. The authors present the Flickr Diverse Humans (FDH) dataset, a subset of YFCC100M containing 1.5 million images of humans in diverse settings, and show that the larger dataset directly improves the visual quality of the generated human figures. They also propose a novel anonymization framework that combines detections from several modalities to improve the segmentation and detection of human figures.
The framework uses separate anonymizers for:
- human figures detected by dense pose estimation (CSE)
- human figures that CSE does not detect
- all remaining faces
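The three cases above amount to a routing decision per detection. The following is a minimal sketch of that idea, not the DeepPrivacy2 code: the `Detection` class, its fields, and the anonymizer names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Hypothetical summary of what the detectors found for one region."""
    has_body: bool  # a person segmentation exists for this region
    has_cse: bool   # dense pose (CSE) correspondences exist for that person
    has_face: bool  # a face box exists in this region


def choose_anonymizer(det: Detection) -> Optional[str]:
    """Return the name of the anonymizer to run, or None if nothing was detected."""
    if det.has_body and det.has_cse:
        return "body_with_cse"     # figure found by dense pose estimation
    if det.has_body:
        return "body_without_cse"  # figure that the CSE detector missed
    if det.has_face:
        return "face"              # remaining face with no body detection
    return None


# Toy detections, one per case in the list above (assumed values).
detections = [
    Detection(has_body=True, has_cse=True, has_face=True),
    Detection(has_body=True, has_cse=False, has_face=False),
    Detection(has_body=False, has_cse=False, has_face=True),
]
routes = [choose_anonymizer(d) for d in detections]
```

Checking the most specific case first means a person with both a body and a face detection is handled once, by the full-body anonymizer, rather than twice.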
The proposed approach uses a simple inpainting GAN for each class, trained with standard GAN techniques. The results show that this GAN can produce high-quality and diverse identities with little task-specific modeling. The authors also apply their GAN to face anonymization on an updated Flickr Diverse Faces (FDF) dataset. Because the GAN does not rely on pose guidance, it can anonymize people even when pose information is hard to detect, a significant improvement over previous face anonymization methods.
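Anonymization by inpainting can be pictured as three steps: hide the region to be replaced, let a generator fill it in, and keep the original pixels everywhere else. The sketch below illustrates only that general pattern on a toy grayscale image. The function names and the constant "generator" are placeholders, and whether DeepPrivacy2 composites its output exactly this way is an assumption.

```python
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image: rows of pixel values
Mask = List[List[int]]     # 1 marks a pixel to replace, 0 a pixel to keep


def mask_out(image: Image, mask: Mask) -> Image:
    """Zero the pixels to be replaced so the generator never sees them."""
    return [[0.0 if m else p for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def inpaint(image: Image, mask: Mask,
            generator: Callable[[Image, Mask], Image]) -> Image:
    """Fill the masked region with generated pixels and keep the rest."""
    conditioned = mask_out(image, mask)
    generated = generator(conditioned, mask)
    return [[g if m else p for p, g, m in zip(img_row, gen_row, mask_row)]
            for img_row, gen_row, mask_row in zip(image, generated, mask)]


def constant_generator(conditioned: Image, mask: Mask) -> Image:
    """Stand-in for a trained GAN: predicts mid-gray for every pixel."""
    return [[0.5 for _ in row] for row in conditioned]


image = [[0.1, 0.2], [0.3, 0.4]]
mask = [[0, 1], [0, 0]]  # only the top-right pixel is anonymized
result = inpaint(image, mask, constant_generator)
```

Because the masked pixels are removed before the generator runs, the output inside the mask cannot copy the original identity, while the unmasked background is preserved unchanged.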
The team also demonstrates that, with a style-based generator, techniques developed for unconditional GANs can be used to find globally semantically meaningful directions in the GAN latent space. As a result, the proposed anonymization pipeline supports text-guided attribute editing.
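Editing along a latent direction usually means adding a scaled direction vector to a latent code before it is passed to the generator. The sketch below shows only that arithmetic. The vectors are made up, `edit_latent` is a hypothetical helper, and the step that finds a direction from a text prompt is not shown.

```python
from typing import List


def edit_latent(w: List[float], direction: List[float],
                strength: float) -> List[float]:
    """Move latent code w along a direction: w + strength * direction."""
    if len(w) != len(direction):
        raise ValueError("latent code and direction must have the same length")
    return [wi + strength * di for wi, di in zip(w, direction)]


# A "global" direction is one vector reused for every latent code.
direction = [1.0, -1.0]  # made-up direction for illustration
w_a = [0.0, 1.0]         # made-up latent codes for two identities
w_b = [2.0, 2.0]

edited_a = edit_latent(w_a, direction, strength=0.5)
edited_b = edit_latent(w_b, direction, strength=0.5)
```

The `strength` parameter controls how far the attribute is pushed, and a negative value moves the code in the opposite direction.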
According to the authors, DeepPrivacy2 surpasses state-of-the-art realistic anonymization approaches in both image quality and anonymization guarantees. They evaluate synthesis quality with both qualitative and quantitative analyses. Since there is no accepted benchmark for anonymization methods, the team compares its results with DeepPrivacy, a widely used face anonymization method, and with surface-guided GANs (SG-GANs) for full-body anonymization. The full-body anonymization generator is trained on the FDH dataset, while the face anonymization generator is trained on FDF256, an updated version of FDF. The evaluation also includes data from Market1501, Cityscapes, and COCO.
The results show that DeepPrivacy2 produces high-quality figures across a wide range of scenes, poses, and overlaps. The full-body generator that does not use CSE tends to produce unnatural arms and legs, which suggests that CSE guidance is still needed for high-quality anonymization.
The team hopes their open-source framework will serve as a valuable resource for organizations and individuals who need anonymization while maintaining image quality, particularly those working in the field of computer vision.
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast and has a strong interest in the scope of application of artificial intelligence in various fields. She is passionate about exploring new advances in technology and its real life application.