Person Re-ID (ReID) aims to identify persons across multiple, non-overlapping cameras. The challenge of obtaining complete data sets has driven the need for data augmentation, and generative adversarial networks (GANs) are emerging as a promising solution.
GANs and variants such as the deep convolutional GAN (DCGAN) have been used to generate pedestrian images for data augmentation. Camera style adaptation (CamStyle) uses CycleGAN to transfer images between camera styles, while the pose-normalized GAN (PNGAN) targets pose variation among pedestrians. Matching people across different camera styles remains the main challenge. GAN-based methods typically produce unlabeled images, and although some techniques reduce differences in camera style, they can introduce noise and redundancy. The diversity of pedestrian poses in front of the cameras is a further challenge.
A research team from China published a new paper to overcome the challenges cited above. The authors introduce an improved CycleGAN for ReID data augmentation. Their method integrates a pose constraint subnetwork that keeps pose consistent while the network learns camera style and identity. They also employ the multi-pseudo regularized label (MpRL) for semi-supervised learning, which allows label weights to be assigned dynamically. Preliminary results indicate superior performance on multiple ReID data sets.
The complete system comprises two generators, two discriminators, and two semantic segmentation networks. The segmentation networks, called pose constraint networks, keep pedestrian poses consistent across images. In the improved CycleGAN, a generator creates fake images and a discriminator evaluates their authenticity. Through an iterative process, the generated images are progressively refined until they closely resemble real images. An important feature of this approach is the pose constraint loss, which ensures that the pose in one domain (X) aligns with the pose in the other domain (Y). This loss is calculated by measuring the pixel distance between the fake and real images.
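The pose constraint loss can be illustrated with a minimal sketch. The article only says the loss is a "pixel distance", so the choice of mean absolute (L1) distance here is an assumption, as is the idea of comparing small binary pose maps such as a segmentation subnetwork might output. The function name `pose_constraint_loss` and the toy masks are hypothetical and are not taken from the paper.

```python
def pose_constraint_loss(pose_real, pose_fake):
    """Mean absolute per-pixel distance between two pose maps.

    Both arguments are 2D lists of numbers with the same shape.
    Assumption: L1 distance; the article does not name the exact metric.
    """
    if len(pose_real) != len(pose_fake):
        raise ValueError("pose maps must have the same number of rows")
    total = 0.0
    count = 0
    for row_real, row_fake in zip(pose_real, pose_fake):
        if len(row_real) != len(row_fake):
            raise ValueError("pose maps must have the same number of columns")
        for r, f in zip(row_real, row_fake):
            total += abs(r - f)
            count += 1
    return total / count


# Toy 2x3 pose masks (1 = pedestrian pixel, 0 = background).
real_mask = [[0, 1, 0],
             [0, 1, 0]]
fake_mask = [[0, 1, 0],
             [0, 0, 1]]  # the lower part of the body has shifted by one pixel

loss_same = pose_constraint_loss(real_mask, real_mask)
loss_shifted = pose_constraint_loss(real_mask, fake_mask)
```

A translated image whose pose map matches the original contributes zero loss, while any pixel where the two maps disagree raises it, which is what pushes the generator to change camera style without changing pose.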
Additionally, CycleGAN uses cycle consistency to map generated images back to their source domain, preserving the integrity of the transformations. The authors also outline a training strategy for the improved CycleGAN: it involves image annotation tools, pre-training of specific subnetworks, and continuous optimization of the total loss function.
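Cycle consistency can be sketched with toy stand-ins for the two generators. In the snippet below, `g_xy` and `f_yx` are hypothetical functions that only shift pixel brightness, and images are flat lists of floats; real generators are convolutional networks, and the L1 form of the loss follows the standard CycleGAN formulation rather than anything stated in the article.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, f_yx):
    """L1 cycle loss: x -> G -> F should return x, and y -> F -> G should return y."""
    forward = mean_abs_diff(f_yx(g_xy(x)), x)   # X -> Y -> X
    backward = mean_abs_diff(g_xy(f_yx(y)), y)  # Y -> X -> Y
    return forward + backward


# Hypothetical toy generators: camera "style" is modeled as a brightness offset.
def g_xy(img):
    return [p + 0.5 for p in img]

def f_yx_exact(img):
    return [p - 0.5 for p in img]   # exact inverse of g_xy

def f_yx_biased(img):
    return [p - 0.25 for p in img]  # imperfect inverse of g_xy


x = [0.0, 0.25, 0.5]
y = [0.5, 0.75, 1.0]

loss_exact = cycle_consistency_loss(x, y, g_xy, f_yx_exact)
loss_biased = cycle_consistency_loss(x, y, g_xy, f_yx_biased)
```

When the two mappings invert each other, the loss is zero; the biased inverse leaves a residual in both directions, which is the signal the training would minimize.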
Finally, the paper presents the multi-pseudo regularized label (MpRL) method, designed to assign labels to generated images more effectively than traditional semi-supervised learning techniques. MpRL assigns different weights to different training classes, allowing finer labeling of the generated images and improving re-identification results. This contrasts with the label smoothing regularization for outliers (LSRO) strategy, which assigns uniform weights to all training classes and often yields less accurate predictions.
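The difference between uniform and per-class label weights can be made concrete with a soft-label cross-entropy. The sketch below is an illustration only: `lsro_weights` reproduces the uniform 1/K distribution that LSRO assigns to generated images, while `example_nonuniform_weights` is a hand-picked, hypothetical distribution standing in for MpRL-style weights. The article does not give MpRL's actual weighting rule, so none is implemented here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def soft_label_loss(logits, label_weights):
    """Cross-entropy between a soft label distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(w * math.log(p) for w, p in zip(label_weights, probs))


def lsro_weights(num_classes):
    """LSRO-style label for a generated image: the same weight for every class."""
    return [1.0 / num_classes] * num_classes


# Hypothetical non-uniform weights (sum to 1); NOT the paper's formula.
example_nonuniform_weights = [0.5, 0.3, 0.2]

# Classifier scores for one generated image over three identities.
logits = [2.0, 1.0, 0.0]

loss_uniform = soft_label_loss(logits, lsro_weights(3))
loss_nonuniform = soft_label_loss(logits, example_nonuniform_weights)
```

With these scores the non-uniform label that leans toward the class the network already favors produces a smaller loss than the uniform label, which shows how per-class weights change the training signal a generated image contributes.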
To evaluate the proposed method, the authors tested it on three person re-identification datasets: Market-1501, DukeMTMC-reID, and CUHK03-NP. These datasets pose challenges such as color differences between cameras and data imbalance. Rank-n and mean average precision (mAP) were the main evaluation metrics. The experiments were implemented in Python 3 with PyTorch on a Linux server. The improved CycleGAN was trained first to handle differences between cameras, followed by the ReID network. For validation, the authors performed an ablation study. The improved CycleGAN produced better rank-1 and mAP scores than the standard CycleGAN, and its best hyperparameters were determined experimentally. Comparisons between LSRO and MpRL showed that MpRL was superior, while combining several popular loss functions with MpRL affected performance in different ways. The results indicate that the improved CycleGAN combined with MpRL outperformed conventional data augmentation techniques, reducing the effect of camera style differences and improving re-identification accuracy. A comparison with other state-of-the-art methods further supported the approach.
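Rank-n and mAP can be computed from ranked gallery lists, as in the sketch below. It assumes each query comes with gallery identities already sorted by similarity, and it omits the same-camera filtering that ReID benchmark protocols typically apply; the function names and the toy identities are hypothetical.

```python
def rank_k_hit(ranked_ids, query_id, k):
    """True if the query identity appears among the top-k gallery results."""
    return query_id in ranked_ids[:k]


def average_precision(ranked_ids, query_id):
    """Average of the precision values at each position where a correct match occurs."""
    hits = 0
    precision_sum = 0.0
    for position, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def evaluate(queries, k=1):
    """Return (rank-k accuracy, mAP) for a list of (query_id, ranked_gallery_ids)."""
    n = len(queries)
    rank_k = sum(rank_k_hit(ranked, qid, k) for qid, ranked in queries) / n
    mean_ap = sum(average_precision(ranked, qid) for qid, ranked in queries) / n
    return rank_k, mean_ap


# Toy example: two queries with gallery identities sorted by similarity.
queries = [
    ("a", ["a", "b", "a"]),  # correct matches at positions 1 and 3
    ("b", ["a", "b", "c"]),  # correct match at position 2
]

rank1, mean_ap = evaluate(queries, k=1)
rank2, _ = evaluate(queries, k=2)
```

Rank-1 only asks whether the top result is correct, whereas mAP rewards placing every correct match early in the list, which is why the two metrics are usually reported together.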
To conclude, the research team introduced an improved CycleGAN for person re-identification that incorporates a pose constraint subnetwork and reduces variations in camera style. The pose constraint loss maintains pose consistency during identity learning. MpRL is used for label assignment, which improves re-identification accuracy. Evaluations on three ReID data sets confirm the effectiveness of the method. Future work will focus on domain variations to adapt the model to real-world scenarios.
Check out the paper. All credit for this research goes to the researchers of this project.
Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical sciences and a master's degree in telecommunications systems and networks. His current areas of research concern computer vision, stock market prediction and deep learning. He has produced several scientific articles on person re-identification and on the robustness and stability of deep networks.