Person re-identification is an important problem in computer vision. It involves identifying the same individual across different camera views, often under non-ideal conditions. Training accurate re-identification models, however, demands large amounts of diverse, well-labeled data. This is where data augmentation comes into play. Data augmentation techniques increase the quality and quantity of available data, enabling models to learn robust features and adapt to various scenarios.
In the literature, various data augmentation methods are employed for person re-identification. These include random erasing, random horizontal flip, occlusion sample generation, virtual image creation with different lighting conditions, and approaches based on generative adversarial networks (GANs). However, methods like Cutmix and mixup, which can generate high-quality images, are rarely used because they are difficult to adapt to the triplet loss framework on which person re-identification relies.
Recently, a research team from China published a paper that incorporates the Cutmix data augmentation method into person re-identification. The authors extended the commonly used triplet loss to handle decimal similarity labels, so that the similarity between images can be optimized toward a fractional target. They also proposed Strip-Cutmix, an augmentation technique suited to person re-identification, and provided strategies for applying it effectively in this field.
Concretely, the paper adapts both the triplet loss and Cutmix. Cutmix pastes part of one image onto another to create a new image. Though widely used elsewhere, it is seldom employed in person re-identification because the triplet loss cannot work with the decimal similarity labels that Cutmix generates.
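The pasting step is easy to picture with a small example. The following is a minimal sketch of generic Cutmix, not the authors' code: images are plain lists of rows, and the function name `cutmix` and its parameters are hypothetical names chosen for illustration. It returns the mixed image together with the fraction of the first image that survives, which is where a decimal label comes from.

```python
def cutmix(img_a, img_b, top, left, height, width):
    """Paste a rectangular patch of img_b onto a copy of img_a.

    Images are lists of rows of equal size (an assumption for this sketch).
    Returns the mixed image and lam, the area fraction kept from img_a.
    """
    rows, cols = len(img_a), len(img_a[0])
    # Clip the patch so it never runs past the image border.
    bottom = min(top + height, rows)
    right = min(left + width, cols)

    mixed = [list(row) for row in img_a]  # copy, leave img_a untouched
    for r in range(top, bottom):
        mixed[r][left:right] = img_b[r][left:right]

    patch_area = (bottom - top) * (right - left)
    lam = 1 - patch_area / (rows * cols)
    return mixed, lam


# Toy 4x4 "images" whose pixels just record which image they came from.
img_a = [["a"] * 4 for _ in range(4)]
img_b = [["b"] * 4 for _ in range(4)]
mixed, lam = cutmix(img_a, img_b, top=1, left=1, height=2, width=2)
for row in mixed:
    print("".join(row))
print(lam)  # 0.75: three quarters of the pixels still come from img_a
```

In a classification setting, `lam` is used to blend the two class labels. In re-identification there is no fixed class list at test time, which is why a fractional label has to be expressed as a similarity between images instead.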
To reconcile this, the authors modify the triplet loss to accommodate decimal similarity labels, allowing the use of cutmix in tandem with the triplet loss. The modified triplet loss dynamically adjusts the optimization direction based on the target similarity. Additionally, the decision-making conditions of the triplet loss are rewritten to align with the target similarity label.
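To make the idea of a target-dependent optimization direction concrete, here is one possible formulation. It is an illustrative assumption, not the formula from the paper: the function `soft_triplet_loss`, the way the margin is scaled by the similarity gap, and the default margin value are all hypothetical. The sketch only shows the general principle that whichever sample has the higher target similarity to the anchor should end up closer to it.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def soft_triplet_loss(anchor, first, second, sim_first, sim_second, margin=0.3):
    """Triplet-style loss driven by decimal similarity targets (hypothetical).

    sim_first and sim_second are the target similarities of `first` and
    `second` to the anchor, each in [0, 1]. The sample with the larger
    target is treated as the "positive" for this triplet.
    """
    gap = sim_first - sim_second
    if gap == 0:
        # Equal targets: no ordering is requested, so nothing to optimize.
        return 0.0

    d_first = euclidean(anchor, first)
    d_second = euclidean(anchor, second)
    # Assumption: the required margin shrinks as the targets get closer.
    scaled_margin = margin * abs(gap)

    if gap > 0:
        # `first` should be closer to the anchor than `second`.
        return max(0.0, d_first - d_second + scaled_margin)
    # Direction flips: `second` should be closer than `first`.
    return max(0.0, d_second - d_first + scaled_margin)


anchor, far, near = [0.0, 0.0], [3.0, 4.0], [0.0, 1.0]
# The far sample is labeled as more similar, so the loss is large.
print(soft_triplet_loss(anchor, far, near, sim_first=0.75, sim_second=0.25))
# Swapping the targets flips the direction and the constraint is satisfied.
print(soft_triplet_loss(anchor, far, near, sim_first=0.25, sim_second=0.75))
```

With targets of exactly 1 and 0, this sketch reduces to the ordinary triplet loss with `first` as the positive and `second` as the negative, which is the kind of backward compatibility the article describes.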
The underlying difficulty is that the original triplet loss, which plays a vital role in metric learning for person re-identification, expects each pair of images to be labeled as either the same person or different people. An image produced by Cutmix contains parts of two images, so its similarity to other images is a decimal value rather than a clean 0 or 1, and the original loss has no way to use it.
To overcome this challenge, the authors dynamically adjust the optimization direction of the triplet loss according to the decimal labels, making the loss usable with Cutmix while remaining compatible with the original triplet loss. They also introduce Strip-Cutmix, which divides images into horizontal blocks, capitalizing on the fact that similar features of individuals are often found in corresponding locations across images. This approach improves the quality of generated images and leads to better boundary conditions for the triplet loss. Strip-Cutmix differs from standard Cutmix in that it mixes whole image blocks by location, which makes it possible to obtain similarity labels between combined images.
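The strip-based mixing can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the authors' implementation: images are lists of rows, the image height is assumed to be divisible by the number of strips, and the helper names `strip_cutmix` and `strip_similarity` are hypothetical. In particular, defining the similarity label as the fraction of strip positions that share an identity is an assumption made here to show how location-based mixing can yield a label between two mixed images.

```python
def strip_cutmix(img_a, img_b, take_from_b, num_strips):
    """Build a mixed image from horizontal strips of img_a and img_b.

    take_from_b is the set of strip indices (0 = top) copied from img_b;
    all other strips come from img_a at the same vertical position.
    """
    rows = len(img_a)
    if rows % num_strips != 0:
        raise ValueError("image height must be divisible by num_strips")
    strip_height = rows // num_strips

    mixed = []
    for r in range(rows):
        strip = r // strip_height
        source = img_b if strip in take_from_b else img_a
        mixed.append(list(source[r]))
    return mixed


def strip_similarity(ids_x, ids_y):
    """Similarity label between two mixed images (assumed definition).

    ids_x[i] is the identity whose strip fills position i of image x.
    The label is the fraction of positions where both identities agree.
    """
    matches = sum(1 for x, y in zip(ids_x, ids_y) if x == y)
    return matches / len(ids_x)


# Toy 4-row images; each row stands for one horizontal strip.
img_a = [["a"] * 3 for _ in range(4)]
img_b = [["b"] * 3 for _ in range(4)]
mixed = strip_cutmix(img_a, img_b, take_from_b={1, 2}, num_strips=4)
for row in mixed:
    print("".join(row))

# Identity layout of two mixed images, strip by strip.
print(strip_similarity(["A", "B", "B", "A"], ["A", "B", "B", "B"]))  # 0.75
```

Because every strip keeps its vertical position, a head strip is only ever compared with another head strip, which is what makes a per-position similarity label meaningful.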
In practical terms, the solution involves:
- Modifying the triplet loss to handle decimal labels.
- Introducing the Strip-Cutmix technique.
- Determining the optimal scheme for applying Strip-Cutmix during training.
An experimental study was carried out to evaluate the effectiveness of the proposed method. The experiments were conducted on the Market-1501, DukeMTMC-ReID, and MSMT17 datasets. Mean Average Precision (mAP) and Cumulative Matching Characteristics (CMC) were used for evaluation.
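For readers unfamiliar with these two metrics, the sketch below shows how they are computed from ranked retrieval results. It is a simplified illustration, not the evaluation code used in the paper: each query is represented as a list of booleans saying whether the gallery image at each rank shows the same person, and the function names are hypothetical. Standard re-identification protocols also filter out certain gallery images before ranking, which is omitted here.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches[i] is True if the gallery image at rank i + 1
    (sorted from most to least similar) shows the query identity.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this correct match
    return precision_sum / hits if hits else 0.0


def cmc_hit(ranked_matches, k):
    """1.0 if a correct match appears within the top k results, else 0.0."""
    return 1.0 if any(ranked_matches[:k]) else 0.0


def evaluate(queries, k=1):
    """Return (mAP, CMC rank-k accuracy) averaged over all queries."""
    m_ap = sum(average_precision(q) for q in queries) / len(queries)
    cmc_k = sum(cmc_hit(q, k) for q in queries) / len(queries)
    return m_ap, cmc_k


queries = [
    [True, False, True],   # correct matches at ranks 1 and 3
    [False, True, False],  # single correct match at rank 2
]
m_ap, rank1 = evaluate(queries, k=1)
print(round(m_ap, 4), rank1)  # 0.6667 0.5
```

Rank-1 accuracy only asks whether the top result is correct, while mAP rewards placing every correct gallery image near the top, so the two numbers can diverge considerably on the same model.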
The researchers selected ResNet-50 as the backbone and also report results with RegNetY-1.6GF. The proposed method outperformed the compared methods, achieving the best results with both backbones. In addition, the technique showed resistance to overfitting and reached state-of-the-art performance. Overall, the method was consistently superior across the datasets.
In conclusion, the paper studied here introduces an approach for incorporating the Cutmix data augmentation technique into person re-identification. The triplet loss commonly used in person re-identification was extended to accommodate decimal similarity labels while remaining compatible with its original form. Furthermore, a new technique called Strip-Cutmix was introduced, tailored specifically to person re-identification. The authors also investigated how best to apply Strip-Cutmix during training and identified the most effective scheme. The proposed method surpasses other convolutional neural network-based person re-identification models, delivering the best performance among pure convolutional networks.
All credit for this research goes to the researchers on this project.
Mahmoud is a PhD researcher in machine learning. He also holds a bachelor’s degree in physical science and a master’s degree in telecommunications and networking systems. His current areas of research concern computer vision, stock market prediction and deep learning. He has produced several scientific articles about person re-identification and the study of the robustness and stability of deep networks.