Not having enough training data is one of the biggest problems in deep learning today.
A promising solution for computer vision tasks is the automatic generation of annotated synthetic images.
In this article, I will first give an overview of some image generation techniques for synthetic image data.
Then, we will generate a training dataset without the need for manual annotations and use it to train a Faster R-CNN object detection model.
Finally, we will test the trained model on real images.
In theory, synthetic images are perfect: they allow us to generate an almost infinite number of images without any manual annotation effort.
Training datasets with real images and manual annotations can contain a significant number of human labeling errors and are often unbalanced and biased (for example, car images are typically taken from the side or the front, and on a road).
However, synthetic images suffer from a problem known as the sim-to-real domain gap.
This gap arises from the fact that we train on synthetic images, but want to apply our model to real-world images during deployment.
There are several image generation techniques that attempt to bridge this domain gap.
Cut and paste
One of the easiest ways to create synthetic training images is the cut and paste method.
As shown below, this technique requires real images from which the objects to be recognized are cut out. These objects can then be pasted onto random background images to generate a large number of new training images.
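To make the idea concrete, here is a minimal sketch of a single cut-and-paste step in Python using Pillow. The function name cut_and_paste and its arguments are illustrative (not from a specific library), and it assumes you already have the cropped object image together with a matching binary mask:

```python
import random
from PIL import Image

def cut_and_paste(object_img, object_mask, background_img):
    """Paste a cut-out object onto a background at a random position
    and return the composited image plus its bounding-box annotation.

    object_img:      RGB PIL image of the cropped object
    object_mask:     grayscale PIL image, white where the object is visible
    background_img:  RGB PIL image used as the new background
    """
    bg = background_img.copy()
    bg_w, bg_h = bg.size
    obj_w, obj_h = object_img.size

    # Pick a random top-left corner so the object stays fully inside the background.
    x = random.randint(0, max(0, bg_w - obj_w))
    y = random.randint(0, max(0, bg_h - obj_h))

    # Use the mask as an alpha channel so only the object pixels are pasted.
    bg.paste(object_img, (x, y), mask=object_mask)

    # The bounding box is known exactly, so no manual annotation is needed.
    bbox = (x, y, x + obj_w, y + obj_h)  # (x_min, y_min, x_max, y_max)
    return bg, bbox
```

Because we place the object ourselves, the bounding box (and, if desired, the segmentation mask) comes for free, which is exactly what makes this approach attractive for generating large annotated datasets.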
While Georgakis et al. (2) argue that the position of these objects should be realistic to obtain better results (e.g., an object…