Object detection is fundamental in artificial intelligence and serves as the backbone of numerous cutting-edge applications. From autonomous vehicles and surveillance systems to medical imaging and augmented reality, the ability to identify and locate objects in images and videos is transforming industries around the world. A powerful and versatile tool, the TensorFlow Object Detection API simplifies the creation of robust object detection models. By leveraging this API, developers can train custom models tailored to specific needs, significantly reducing development time and complexity.
In this guide, we will explore the step-by-step process of training an object detection model using TensorFlow, focusing on integrating datasets from Roboflow Universe, a rich repository of annotated datasets designed to accelerate AI development.
Learning objectives
- Learn how to configure the TensorFlow Object Detection API environment for efficient model training.
- Understand how to prepare and preprocess data sets for training, using the TFRecord format.
- Gain experience selecting and customizing a pre-trained object detection model for specific needs.
- Learn how to edit pipeline configuration files and tune model parameters to optimize performance.
- Master the training process, including handling checkpoints and evaluating model performance during training.
- Understand how to export the trained model for inference and deployment in real-world applications.
This article was published as part of the Data Science Blogathon.
Step-by-step implementation of object detection with TensorFlow
In this section, we'll walk you through a step-by-step implementation of object detection using TensorFlow, guiding you from setup to deployment.
Step 1: Set up the environment
The TensorFlow Object Detection API requires several dependencies. Start by cloning the TensorFlow models repository:
# Clone the tensorflow models repository from GitHub
!pip uninstall Cython -y # Temporary fix for "No module named 'object_detection'" error
!git clone --depth 1 https://github.com/tensorflow/models
- Uninstall Cython: This step ensures that there are no conflicts with the Cython library during installation.
- Clone the TensorFlow model repository: This repository contains the official TensorFlow models, including the object detection API.
Copy the configuration files and modify the setup.py file
%%bash
# Compile the protocol buffer definitions into Python modules
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
# Run the following Python code in a separate cell
# Write a copy of setup.py into models/research that installs tf-models-official v2.8.0
import re
with open('/content/models/research/object_detection/packages/tf2/setup.py') as f:
    s = f.read()
with open('/content/models/research/setup.py', 'w') as f:
    # Pin tf-models-official to an exact version instead of a minimum version
    s = re.sub('tf-models-official>=2.5.1',
               'tf-models-official==2.8.0', s)
    f.write(s)
Why is this necessary?
- Compilation of protocol buffers: The Object Detection API uses .proto files to define model configurations and data structures. These must be compiled into Python code to work.
- Dependency Version Compatibility: TensorFlow and its dependencies evolve. Using tf-models-official>=2.5.1 may inadvertently install an incompatible version for TensorFlow v2.8.0.
- Explicitly setting tf-models-official==2.8.0 avoids potential version conflicts and ensures stability.
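The effect of the version pin can be tried on a small in-memory string before touching the real file. The following is a minimal sketch: `pin_requirement` is a hypothetical helper written for illustration, and `setup_text` is a made-up stand-in for the real setup.py contents.

```python
import re

def pin_requirement(setup_text, package, version):
    """Replace a 'package>=x.y.z' specifier with an exact 'package==version' pin."""
    pattern = re.escape(package) + r">=[0-9][0-9A-Za-z.]*"
    return re.sub(pattern, f"{package}=={version}", setup_text)

# Made-up sample standing in for the real setup.py contents
setup_text = "REQUIRED_PACKAGES = ['pillow', 'tf-models-official>=2.5.1', 'lxml']"
pinned = pin_requirement(setup_text, "tf-models-official", "2.8.0")
print(pinned)
```

Escaping the package name keeps characters such as `-` or `.` from being read as regex syntax, so only the intended requirement is rewritten.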
Installing dependency libraries
TensorFlow models typically depend on specific library versions. Pinning the TensorFlow version ensures a smooth integration.
# Install the Object Detection API
# Need to do a temporary fix with PyYAML because Colab isn't able to install PyYAML v5.4.1
!pip install pyyaml==5.3
!pip install /content/models/research/
# Need to downgrade to TF v2.8.0 due to Colab compatibility bug with TF v2.10 (as of 10/03/22)
!pip install tensorflow==2.8.0
!pip install tensorflow_io==0.23.1
# Install CUDA version 11.0 (to maintain compatibility with TF v2.8.0)
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget http://developer.download.nvidia.com/compute/cuda/11.0.2/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb
!apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub
!apt-get update && sudo apt-get install cuda-toolkit-11-0
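# "!export" only affects a temporary subshell, so set the variable on the notebook process instead
import os
os.environ['LD_LIBRARY_PATH'] = '/usr/local/cuda-11.0/lib64:' + os.environ.get('LD_LIBRARY_PATH', '')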
After running this block, restart the session and run the block a second time so that all dependencies are installed successfully.
Installing a suitable version of the protobuf library to resolve dependency issues
!pip install protobuf==3.20.1
Step 2: Verify the environment and installation
To confirm that the installation works, run the following test:
# Run Model Builder Test file, just to verify everything's working properly
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
If no errors appear, the environment is configured correctly.
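Alongside the model builder test, it can help to confirm that the pinned versions are the ones actually installed. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `find_version_mismatches` is a hypothetical helper, and the expected versions are the ones pinned earlier in this guide.

```python
from importlib import metadata

def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for packages whose version differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches

# Versions pinned earlier in this guide
expected_versions = {
    "tensorflow": "2.8.0",
    "tensorflow-io": "0.23.1",
    "protobuf": "3.20.1",
}
for package, (wanted, installed) in find_version_mismatches(expected_versions).items():
    print(f"{package}: expected {wanted}, found {installed}")
```

If nothing is printed, the installed versions match the pins.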
Step 3: Prepare training data
For this tutorial, we will use the “Person detection” dataset from Roboflow Universe. Follow these steps to prepare it:
Visit the dataset page:
Fork the dataset into your workspace to make it accessible for customization.
Build a version of the data set to finalize your preprocessing settings, such as scaling and resizing.
Now, download it in TFRecord format, which is a binary format optimized for TensorFlow workflows. TFRecord stores data efficiently and allows TensorFlow to read large data sets during training with minimal overhead.
Once downloaded, place the dataset files in your Google Drive, mount the Drive in your notebook, and point the code at those files:
from google.colab import drive
drive.mount('/content/gdrive')
train_record_fname="/content/gdrive/MyDrive/images/train/train.tfrecord"
val_record_fname="/content/gdrive/MyDrive/images/test/test.tfrecord"
label_map_pbtxt_fname="/content/gdrive/MyDrive/images/label_map.pbtxt"
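Before training, a quick sanity check is to count how many records each TFRecord file holds. The sketch below does this without TensorFlow by walking the file's length-prefixed frames. `count_tfrecord_records` is a hypothetical helper, and it skips the CRC fields instead of verifying them, so it detects truncation but not corruption.

```python
import struct

def count_tfrecord_records(path):
    """Count records in a TFRecord file by walking its length-prefixed frames.

    Each record is stored as: an 8-byte little-endian length, a 4-byte CRC of
    the length, the payload, and a 4-byte CRC of the payload. The CRCs are
    skipped here, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header)
            f.seek(4, 1)  # skip the CRC of the length field
            payload = f.read(length)
            if len(payload) < length:
                raise ValueError("truncated record payload")
            f.seek(4, 1)  # skip the CRC of the payload
            count += 1
    return count
```

For example, `count_tfrecord_records(train_record_fname)` should return the number of training images in the exported dataset version, since each record holds one annotated image.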
Step 4: Configure Training Settings
Now it's time to set the configuration for the object detection model. For this example, we will use the efficientdet-d0 model. You can choose from other models like ssd-mobilenet-v2 or ssd-mobilenet-v2-fpnlite-320, but for this guide, we will focus on efficientdet-d0.
# Change the chosen_model variable to deploy different models available in the TF2 object detection zoo
chosen_model="efficientdet-d0"
MODELS_CONFIG = {
'ssd-mobilenet-v2': {
'model_name': 'ssd_mobilenet_v2_320x320_coco17_tpu-8',
'base_pipeline_file': 'ssd_mobilenet_v2_320x320_coco17_tpu-8.config',
'pretrained_checkpoint': 'ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz',
},
'efficientdet-d0': {
'model_name': 'efficientdet_d0_coco17_tpu-32',
'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
},
'ssd-mobilenet-v2-fpnlite-320': {
'model_name': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8',
'base_pipeline_file': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.config',
'pretrained_checkpoint': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz',
},
}
model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
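A typo in `chosen_model` would otherwise surface as a bare `KeyError`. As an optional refinement, the lookup can be wrapped so that it lists the valid options. This is a sketch only: `get_model_settings` is a hypothetical helper, and `models_config` is a trimmed copy of the table above.

```python
def get_model_settings(models_config, chosen_model):
    """Return the settings dict for chosen_model, or raise with the valid options."""
    try:
        return models_config[chosen_model]
    except KeyError:
        options = ", ".join(sorted(models_config))
        raise ValueError(
            f"Unknown model '{chosen_model}'. Choose one of: {options}"
        ) from None

# Trimmed copy of the table above, kept small for illustration
models_config = {
    "efficientdet-d0": {
        "model_name": "efficientdet_d0_coco17_tpu-32",
        "base_pipeline_file": "ssd_efficientdet_d0_512x512_coco17_tpu-8.config",
        "pretrained_checkpoint": "efficientdet_d0_coco17_tpu-32.tar.gz",
    },
}
settings = get_model_settings(models_config, "efficientdet-d0")
print(settings["model_name"])
```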
Then we download the pre-trained weights and the configuration file corresponding to the chosen model:
# Create "mymodel" folder for holding pre-trained weights and configuration files
%mkdir /content/models/mymodel/
%cd /content/models/mymodel/
# Download pre-trained model weights
import tarfile
download_tar="http://download.tensorflow.org/models/object_detection/tf2/20200711/" + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()
# Download training configuration file for model
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}
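The fine-tuning step later refers to the extracted weights by a checkpoint prefix such as `ckpt-0`, which is not a file in its own right. The sketch below checks that the files behind a prefix are on disk. `checkpoint_prefix_exists` is a hypothetical helper, and it assumes the usual TF2 layout of one `.index` file plus one or more `.data-*` shards.

```python
from pathlib import Path

def checkpoint_prefix_exists(prefix):
    """Return True if a TF2 checkpoint with this prefix is on disk.

    A prefix such as '.../checkpoint/ckpt-0' is stored as 'ckpt-0.index'
    plus one or more 'ckpt-0.data-*' shard files.
    """
    prefix = Path(prefix)
    index_file = prefix.with_name(prefix.name + ".index")
    if not index_file.is_file():
        return False
    # The parent directory exists at this point, so globbing it is safe
    return any(prefix.parent.glob(prefix.name + ".data-*"))
```

For the model used in this guide, the call would be `checkpoint_prefix_exists('/content/models/mymodel/' + model_name + '/checkpoint/ckpt-0')`. A `False` result usually means the download or extraction did not complete.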
After this, we configure the number of steps for training and the batch size depending on the selected model:
# Set training parameters for the model
num_steps = 4000
if chosen_model == 'efficientdet-d0':
    batch_size = 8
else:
    batch_size = 8
You can increase or decrease num_steps and batch_size to suit your dataset and the available GPU memory.
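To judge whether num_steps is reasonable, it helps to convert it into approximate passes over the training set. This is a small worked example: `approx_epochs` is a hypothetical helper, and the 2000-image training set is a made-up figure, not the size of the dataset used here.

```python
def approx_epochs(num_steps, batch_size, num_train_images):
    """Approximate number of passes over the training set for a given schedule."""
    return num_steps * batch_size / num_train_images

# 4000 steps at batch size 8 process 32000 images in total;
# with a hypothetical 2000-image training set that is 16 passes
print(approx_epochs(4000, 8, 2000))  # 16.0
```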
Step 5: Modify the Pipeline Configuration File
We need to customize the pipeline.config file with the paths to our dataset and model parameters. The pipeline.config file contains various settings, such as batch size, number of classes, and fine-tuning checkpoints. We make these modifications by reading the template and replacing the relevant fields:
# Set file locations and get number of classes for config file
pipeline_fname="/content/models/mymodel/" + base_pipeline_file
fine_tune_checkpoint="/content/models/mymodel/" + model_name + '/checkpoint/ckpt-0'
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
num_classes = get_num_classes(label_map_pbtxt_fname)
print('Total classes:', num_classes)
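To see which classes the count refers to, the label map can also be inspected directly as text. The sketch below uses a simple regex and does not need the `object_detection` package. `parse_label_map` is a hypothetical helper, `sample` is a made-up single-class label map, and the regex only handles flat `item { ... }` blocks.

```python
import re

def parse_label_map(pbtxt_text):
    """Return {id: name} for each item block in a label map, using a simple regex."""
    items = {}
    for block in re.findall(r"item\s*\{(.*?)\}", pbtxt_text, flags=re.DOTALL):
        id_match = re.search(r"\bid:\s*(\d+)", block)
        # \b keeps 'display_name:' from being mistaken for 'name:'
        name_match = re.search(r"\bname:\s*[\"'](.*?)[\"']", block)
        if id_match and name_match:
            items[int(id_match.group(1))] = name_match.group(1)
    return items

# Made-up single-class label map
sample = """
item {
  id: 1
  name: 'person'
}
"""
labels = parse_label_map(sample)
print(labels)       # {1: 'person'}
print(len(labels))  # 1
```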
# Create custom configuration file by writing the dataset, model checkpoint, and training parameters into the base pipeline file
import re
%cd /content/models/mymodel
print('writing custom configuration file')
with open(pipeline_fname) as f:
    s = f.read()
with open('pipeline_file.config', 'w') as f:
    # Set fine_tune_checkpoint path
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    # Set tfrecord files for train and test datasets
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")', 'input_path: "{}"'.format(val_record_fname), s)
    # Set label_map_path
    s = re.sub(
        'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
    # Set batch_size ([0-9]+ matches one or more digits)
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)
    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    # Set number of classes num_classes
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    # Change fine-tune checkpoint type from "classification" to "detection"
    s = re.sub(
        'fine_tune_checkpoint_type: "classification"', 'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
    # If using ssd-mobilenet-v2, reduce learning rate (because it's too high in the default config file)
    if chosen_model == 'ssd-mobilenet-v2':
        s = re.sub('learning_rate_base: .8',
                   'learning_rate_base: .08', s)
        s = re.sub('warmup_learning_rate: 0.13333',
                   'warmup_learning_rate: .026666', s)
    # If using efficientdet-d0, use fixed_shape_resizer instead of keep_aspect_ratio_resizer (because it isn't supported by TFLite)
    if chosen_model == 'efficientdet-d0':
        s = re.sub('keep_aspect_ratio_resizer', 'fixed_shape_resizer', s)
        s = re.sub('pad_to_max_dimension: true', '', s)
        s = re.sub('min_dimension', 'height', s)
        s = re.sub('max_dimension', 'width', s)
    f.write(s)
# (Optional) Display the custom configuration file's contents
!cat /content/models/mymodel/pipeline_file.config
# Set the path to the custom config file and the directory to store training checkpoints in
pipeline_file="/content/models/mymodel/pipeline_file.config"
model_dir="/content/training/"
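A substitution that silently matches nothing leaves the template value in place, so it is worth checking the written file for leftover placeholders. The following is a minimal sketch on a made-up config fragment: `find_unconfigured_lines` is a hypothetical helper, and `template` only imitates the style of a real pipeline config.

```python
import re

def find_unconfigured_lines(config_text):
    """Return config lines that still contain the PATH_TO_BE_CONFIGURED placeholder."""
    return [line.strip() for line in config_text.splitlines()
            if "PATH_TO_BE_CONFIGURED" in line]

# Made-up fragment in the style of a pipeline config
template = '''
train_config {
  batch_size: 128
  num_steps: 300000
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
}
'''
# [0-9]+ is a character class matching one or more digits
edited = re.sub(r"batch_size: [0-9]+", "batch_size: 8", template)
edited = re.sub(r"num_steps: [0-9]+", "num_steps: 4000", edited)
print(find_unconfigured_lines(edited))
```

Here the checkpoint path was deliberately left unedited, so it is the one line reported. On the real file, an empty list suggests that every placeholder path was replaced.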
Step 6: Train the model
Now we can train the model using the custom pipeline configuration file. The training script will save checkpoints, which you can use to evaluate the performance of your model:
# Run training!
!python /content/models/research/object_detection/model_main_tf2.py \
--pipeline_config_path={pipeline_file} \
--model_dir={model_dir} \
--alsologtostderr \
--num_train_steps={num_steps} \
--sample_1_of_n_eval_examples=1
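While training runs, checkpoints accumulate in the model directory as numbered `ckpt-N` files. The sketch below reports the highest checkpoint number using only the standard library. `latest_checkpoint_number` is a hypothetical helper that assumes the `ckpt-N.index` naming; in a TensorFlow session, `tf.train.latest_checkpoint(model_dir)` returns the latest checkpoint prefix directly.

```python
import re
from pathlib import Path

def latest_checkpoint_number(model_dir):
    """Return the highest N among 'ckpt-N.index' files in model_dir, or None."""
    numbers = []
    for path in Path(model_dir).glob("ckpt-*.index"):
        match = re.fullmatch(r"ckpt-(\d+)\.index", path.name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) if numbers else None
```

Note that N is a save counter and not the training step. Comparing the numbers as integers avoids the string-sorting trap where `ckpt-10` would sort before `ckpt-2`.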
Step 7: Save the trained model
Once training is complete, we export the trained model so it can be used for inference. We use the exporter_main_v2.py script to export the model:
!python /content/models/research/object_detection/exporter_main_v2.py \
--input_type image_tensor \
--pipeline_config_path {pipeline_file} \
--trained_checkpoint_dir {model_dir} \
--output_directory /content/exported_model
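Once the export script finishes, a quick check of the output folder can catch an incomplete export before you download it. This sketch assumes the layout the exporter usually produces (a `saved_model` folder containing `saved_model.pb`, a `checkpoint` folder, and a `pipeline.config` file); `missing_export_entries` is a hypothetical helper.

```python
from pathlib import Path

def missing_export_entries(export_dir):
    """Return the expected export entries that are absent from export_dir."""
    # Assumed layout of an exported model directory
    expected = ["saved_model/saved_model.pb", "checkpoint", "pipeline.config"]
    root = Path(export_dir)
    return [entry for entry in expected if not (root / entry).exists()]
```

Calling `missing_export_entries('/content/exported_model')` should return an empty list when the export completed.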
Finally, we compress the exported model into a zip file and download it:
import shutil
# Path to the exported model folder
exported_model_path="/content/exported_model"
# Path where the zip file will be saved
zip_file_path="/content/exported_model.zip"
# Create a zip file of the exported model folder
shutil.make_archive(zip_file_path.replace('.zip', ''), 'zip', exported_model_path)
# Download the zip file using Google Colab's file download utility
from google.colab import files
files.download(zip_file_path)
You can use the downloaded model files to run inference on unseen images or to integrate the model into your own applications.
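When you run the exported model on an image, the raw output typically needs post-processing before it is useful. The sketch below assumes the model returns boxes normalized as [ymin, xmin, ymax, xmax] together with per-box scores and class IDs (the `detection_boxes`, `detection_scores`, and `detection_classes` outputs). `filter_detections` is a hypothetical helper, and the sample values are made up.

```python
def filter_detections(boxes, scores, classes, image_width, image_height, min_score=0.5):
    """Keep detections scoring at least min_score and convert boxes to pixels.

    boxes are assumed normalized as [ymin, xmin, ymax, xmax] in the range 0..1.
    Returns a list of (class_id, score, (left, top, right, bottom)) tuples.
    """
    results = []
    for (ymin, xmin, ymax, xmax), score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        pixel_box = (
            round(xmin * image_width),
            round(ymin * image_height),
            round(xmax * image_width),
            round(ymax * image_height),
        )
        results.append((int(class_id), float(score), pixel_box))
    return results

# Made-up outputs for a 640x480 image
boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]]
scores = [0.9, 0.3]
classes = [1.0, 1.0]
print(filter_detections(boxes, scores, classes, image_width=640, image_height=480))
```

The 0.5 threshold is only a starting point. Raising it trades missed detections for fewer false positives.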
See the accompanying Colab notebook for the detailed code.
Conclusion
In conclusion, this guide gives you the knowledge and tools necessary to train an object detection model using the TensorFlow Object Detection API, leveraging Roboflow Universe datasets for rapid customization. By following the steps outlined, you can effectively prepare your data, configure the training process, select the right model, and tune it to meet your specific needs. Additionally, the ability to export and deploy your trained model opens up vast possibilities for real-world applications, whether in autonomous vehicles, medical imaging, or surveillance systems. This workflow allows you to build powerful, scalable object detection systems with reduced complexity and faster deployment time.
Key takeaways
- The TensorFlow Object Detection API provides a flexible framework for creating custom object detection models with pre-trained options, reducing development time and complexity.
- The TFRecord format is essential for efficient data handling, especially with large data sets in TensorFlow, allowing for fast training and minimal overhead.
- Pipeline configuration files are crucial for configuring and tuning the model to work with your specific dataset and desired performance characteristics.
- Pretrained models like efficientdet-d0 and ssd-mobilenet-v2 provide solid starting points for training custom models, and each has specific strengths depending on use case and resource constraints.
- The training process involves managing parameters such as batch size, number of steps, and model checkpoints to ensure that the model learns optimally.
- Exporting the model is essential for using the trained object detection model in real-world applications, because it packages the model so that it is ready for deployment.
Frequently asked questions
A: The TensorFlow Object Detection API is a flexible, open source framework for creating, training, and deploying custom object detection models. It provides tools to fine-tune pre-trained models and create solutions tailored to specific use cases.
A: TFRecord is a binary file format optimized for TensorFlow pipelines. It enables efficient data handling, ensuring faster loading, minimal I/O overhead, and smoother training, especially with large data sets.
A: Pipeline configuration files allow for seamless model customization by defining parameters such as dataset paths, learning rate, model architecture, and training steps to suit specific datasets and performance goals.
A: Select EfficientDet-D0 for a balance of accuracy and efficiency, ideal for edge devices, and SSD-MobileNet-V2 for lightweight, fast real-time use cases such as mobile apps.
The media shown in this article is not the property of Analytics Vidhya and is used at the author's discretion.