Abstract: This article explains how to use OpenCV's cv::dnn::NMSBoxes() function to eliminate redundant bounding boxes in object detection pipelines, with a step-by-step YOLOv3-based example.
2024-06-24 by Try Catch Debug
Using OpenCV's cv::dnn::NMSBoxes() Function for Object Detection in Software Development
OpenCV is an open-source computer vision and machine learning software library. One of its powerful features is the Deep Neural Network (DNN) module, which allows developers to use pre-trained deep learning models for object detection tasks. The Non-Maximum Suppression (NMS) algorithm is an essential step in object detection to eliminate redundant bounding boxes and select the most probable object. In this article, we will explore how to use OpenCV's cv::dnn::NMSBoxes() function for object detection in software development.
Prerequisites
Before diving into the cv::dnn::NMSBoxes() function, it is essential to have a basic understanding of the following concepts:
- Deep learning and neural networks
- Convolutional Neural Networks (CNNs) for object detection
- OpenCV and its DNN module
Object Detection Overview
Object detection is the process of identifying objects in an image or video and drawing bounding boxes around them. Modern object detection techniques involve deep learning models such as Faster R-CNN, YOLO, and SSD. These models use CNNs to extract features from images and predict the presence and location of objects.
Non-Maximum Suppression (NMS)
NMS is an algorithm used to eliminate redundant bounding boxes in object detection. When several boxes overlap the same object, NMS keeps the box with the highest confidence score and discards any remaining box whose overlap (measured as Intersection over Union, IoU) with it exceeds a threshold. The cv::dnn::NMSBoxes() function in OpenCV implements this algorithm.
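The greedy procedure behind NMS can be sketched in plain C++ without OpenCV. The Box, iou, and nms names below are illustrative, not OpenCV API; when using OpenCV directly, cv::Rect plays the role of Box and cv::dnn::NMSBoxes() replaces nms():

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Minimal axis-aligned box: top-left corner plus width and height.
struct Box {
    float x, y, w, h;
};

// Intersection over Union of two boxes.
float iou(const Box& a, const Box& b) {
    float x1 = std::max(a.x, b.x);
    float y1 = std::max(a.y, b.y);
    float x2 = std::min(a.x + a.w, b.x + b.w);
    float y2 = std::min(a.y + a.h, b.y + b.h);
    float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: repeatedly keep the highest-scoring remaining box and
// suppress every box that overlaps it by more than iouThresh.
std::vector<int> nms(const std::vector<Box>& boxes,
                     const std::vector<float>& scores,
                     float iouThresh) {
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return scores[a] > scores[b]; });

    std::vector<bool> suppressed(boxes.size(), false);
    std::vector<int> keep;
    for (int i : order) {
        if (suppressed[i]) continue;
        keep.push_back(i);
        for (int j : order) {
            if (!suppressed[j] && j != i && iou(boxes[i], boxes[j]) > iouThresh)
                suppressed[j] = true;
        }
    }
    return keep;  // indices of the boxes that survive suppression
}
```

For two boxes covering the same object, the lower-scoring one is suppressed as soon as its IoU with the kept box passes the threshold; a distant box is unaffected.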
Using cv::dnn::NMSBoxes() Function
To use the cv::dnn::NMSBoxes() function, you need to follow these steps:
- Load the pre-trained deep learning model using the OpenCV DNN module.
- Perform object detection on the input image or video frame using the loaded model.
- Apply the cv::dnn::NMSBoxes() function to eliminate redundant bounding boxes.
Step 1: Loading the Model
First, you need to load the pre-trained deep learning model using the OpenCV DNN module. Here's an example of how to load a YOLOv3 model:
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <fstream>
#include <string>
#include <vector>

using namespace cv;
using namespace dnn;
using namespace std;

int main() {
    // Load the YOLOv3 model (weights and network configuration)
    Net net = readNet("yolov3.weights", "yolov3.cfg");

    // Load the COCO class names, one per line
    vector<string> classNames;
    ifstream classNamesFile("coco.names");
    string className;
    while (getline(classNamesFile, className)) {
        classNames.push_back(className);
    }
}
Step 2: Performing Object Detection
Next, you need to perform object detection on the input image or video frame using the loaded model. Here's an example of how to perform object detection using YOLOv3:
// Perform object detection
Mat frame = imread("input.jpg");
Mat blob = blobFromImage(frame, 1 / 255.0, Size(416, 416), Scalar(0, 0, 0), true, false);
net.setInput(blob);

// YOLOv3 has three output layers; collect them all in one forward pass
vector<Mat> outputs;
net.forward(outputs, net.getUnconnectedOutLayersNames());

// Each output row is [centerX, centerY, width, height, objectness, class scores...],
// with coordinates normalized to the input size
vector<Rect> boxes;
vector<int> classIds;
vector<float> confidences;
for (const Mat& output : outputs) {
    for (int j = 0; j < output.rows; j++) {
        // Find the class with the highest score for this detection
        Mat scores = output.row(j).colRange(5, output.cols);
        Point classIdPoint;
        double confidence;
        minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);
        if (confidence > 0.5) {
            int centerX = static_cast<int>(output.at<float>(j, 0) * frame.cols);
            int centerY = static_cast<int>(output.at<float>(j, 1) * frame.rows);
            int width = static_cast<int>(output.at<float>(j, 2) * frame.cols);
            int height = static_cast<int>(output.at<float>(j, 3) * frame.rows);
            int left = centerX - width / 2;
            int top = centerY - height / 2;
            boxes.push_back(Rect(left, top, width, height));
            classIds.push_back(classIdPoint.x);
            confidences.push_back(static_cast<float>(confidence));
        }
    }
}
Step 3: Applying cv::dnn::NMSBoxes() Function
Finally, you need to apply the cv::dnn::NMSBoxes() function to eliminate redundant bounding boxes. Here's an example of how to use the cv::dnn::NMSBoxes() function:
// Apply Non-Maximum Suppression (NMS)
float scoreThreshold = 0.5f;
float nmsThreshold = 0.45f;
vector<int> indices;
NMSBoxes(boxes, confidences, scoreThreshold, nmsThreshold, indices);

// Draw the surviving bounding boxes on the input image
for (int i = 0; i < static_cast<int>(indices.size()); i++) {
    int idx = indices[i];
    Rect box = boxes[idx];
    rectangle(frame, box, Scalar(255, 0, 0), 2);

    // Draw the confidence score above the box
    string label = format("%.2f", confidences[idx]);
    int baseLine = 0;
    Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
    rectangle(frame, Point(box.x, box.y - labelSize.height),
              Point(box.x + labelSize.width, box.y),
              Scalar(255, 255, 255), FILLED);
    putText(frame, label, Point(box.x, box.y),
            FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
}
Conclusion
In this article, we explored how to use OpenCV's cv::dnn::NMSBoxes() function for object detection in software development. We covered the following key concepts:
- Deep learning and neural networks
- Convolutional Neural Networks (CNNs) for object detection
- OpenCV and its DNN module
- Non-Maximum Suppression (NMS) algorithm
- Using cv::dnn::NMSBoxes() function for object detection
References
- OpenCV: https://opencv.org/
- OpenCV DNN module: https://docs.opencv.org/master/d6/d0f/group__dnn.html
- YOLOv3: https://pjreddie.com/darknet/yolo/
- COCO dataset: https://cocodataset.org/#home