English | 简体中文

🚀 TensorRT YOLO


🚀 TensorRT-YOLO is an easy-to-use, highly efficient inference and deployment tool for the YOLO series, designed specifically for NVIDIA devices. The project integrates TensorRT plugins to enhance post-processing and uses CUDA kernels and CUDA Graphs to accelerate inference. TensorRT-YOLO provides both C++ and Python inference support, aiming to deliver a 📦 out-of-the-box deployment experience. It covers task scenarios such as object detection, instance segmentation, image classification, pose estimation, oriented object detection, and video analysis, meeting developers' deployment needs across multiple scenarios.

✨ Key Features

🎯 Diverse YOLO Support

  • Comprehensive Compatibility: Supports YOLOv3 to YOLOv11 series models, as well as PP-YOLOE and PP-YOLOE+, meeting diverse needs.
  • Flexible Switching: Provides simple and easy-to-use interfaces for quick switching between different YOLO versions.
  • Multi-Scenario Applications: Offers rich example codes covering Detect, Segment, Classify, Pose, OBB, and more.

🚀 Performance Optimization

  • CUDA Acceleration: Optimizes pre-processing through CUDA kernels and accelerates inference using CUDA graphs.
  • TensorRT Integration: Deeply integrates TensorRT plugins to significantly speed up post-processing and improve overall inference efficiency.
  • Multi-Context Inference: Supports multi-context parallel inference to maximize hardware resource utilization.
  • Memory Management Optimization: Adopts memory optimization strategies tailored to different architectures (e.g., Zero Copy mode on Jetson) to improve memory efficiency.

🛠️ Usability

  • Out-of-the-Box: Provides comprehensive C++ and Python inference support to meet different developers' needs.
  • CLI Tools: Built-in command-line tools for quick model export and inference, improving development efficiency.
  • Docker Support: Offers one-click Docker deployment solutions to simplify environment configuration and deployment processes.
  • No Third-Party Dependencies: Apart from CUDA and TensorRT, all functionality is implemented with the standard library, eliminating extra dependencies and simplifying deployment.
  • Easy Deployment: Provides dynamic library compilation support for easy calling and deployment.

🌐 Compatibility

  • Multi-Platform Support: Fully compatible with various operating systems and hardware platforms, including Windows, Linux, ARM, and x86.
  • TensorRT Compatibility: Perfectly adapts to TensorRT 10.x versions, ensuring seamless integration with the latest technology ecosystem.

🔧 Flexible Configuration

  • Customizable Preprocessing Parameters: Supports flexible configuration of various preprocessing parameters, including channel swapping (SwapRB), normalization parameters, and border padding.
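A minimal Python sketch of configuring these options. Only enable_swap_rb() appears elsewhere in this README; the other setters below are hypothetical placeholders for the normalization and border-padding knobs mentioned above, so consult the InferOption documentation for the actual method names:

    from tensorrt_yolo.infer import InferOption

    option = InferOption()
    option.enable_swap_rb()  # channel swap (BGR -> RGB), as used in the inference examples below

    # Hypothetical placeholders -- the real setter names may differ:
    # option.set_normalize_params([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # mean / std normalization
    # option.set_border_value(114)                                   # border padding value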

🔮 Documentation

💨 Quick Start

1. Prerequisites

  • CUDA: Recommended version ≥ 11.0.1
  • TensorRT: Recommended version ≥ 8.6.1
  • Operating System: Linux (x86_64 or arm) (recommended); Windows is also supported
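To sanity-check the TensorRT part of these prerequisites (assuming the TensorRT Python bindings are installed), a quick version check from Python is enough:

    # Verify the TensorRT Python bindings; the recommended version is >= 8.6.1
    import tensorrt as trt

    print("TensorRT version:", trt.__version__)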

2. Installation

3. Model Export

  • Refer to the 🔧 Model Export documentation to export an ONNX model suitable for inference in this project and build it into a TensorRT engine.
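The project's own recommended flow is described in the Model Export documentation. For reference, a minimal sketch of building an engine from an exported ONNX file with the plain TensorRT Python API looks roughly like this (filenames are placeholders; models exported with this project's custom TensorRT plugins additionally need that plugin library loaded and registered before parsing, which is not shown here):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.INFO)
    builder = trt.Builder(logger)

    # EXPLICIT_BATCH is required on TensorRT 8.x; on newer releases the flag is
    # deprecated, so fall back to 0 if it no longer exists.
    explicit_batch = getattr(trt.NetworkDefinitionCreationFlag, "EXPLICIT_BATCH", None)
    flags = (1 << int(explicit_batch)) if explicit_batch is not None else 0

    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open("yolo11n-with-plugin.onnx", "rb") as f:  # placeholder ONNX filename
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise SystemExit("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # optional FP16 build

    engine_bytes = builder.build_serialized_network(network, config)
    if engine_bytes is None:
        raise SystemExit("Engine build failed")

    with open("yolo11n-with-plugin.engine", "wb") as f:
        f.write(engine_bytes)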

4. Inference Example

Note

ClassifyModel, DetectModel, OBBModel, SegmentModel, and PoseModel correspond to image classification (Classify), detection (Detect), oriented bounding box (OBB), segmentation (Segment), and pose estimation (Pose) models, respectively.
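For example, switching the same pipeline from detection to segmentation only requires constructing a different model class. A minimal sketch, assuming SegmentModel is exposed from tensorrt_yolo.infer like DetectModel and using a placeholder segmentation engine filename:

    import cv2
    from tensorrt_yolo.infer import InferOption, SegmentModel

    # Same option setup as in the detection example below
    option = InferOption()
    option.enable_swap_rb()

    # Only the model class changes for a different task;
    # "yolo11n-seg-with-plugin.engine" is a placeholder engine filename.
    model = SegmentModel("yolo11n-seg-with-plugin.engine", option)
    result = model.predict(cv2.imread("test_image.jpg"))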

  • Inference using Python:

    import cv2
    from tensorrt_yolo.infer import InferOption, DetectModel, generate_labels, visualize
    
    # Configure inference options
    option = InferOption()
    option.enable_swap_rb()
    
    # Initialize the model
    model = DetectModel("yolo11n-with-plugin.engine", option)
    
    # Load an image
    im = cv2.imread("test_image.jpg")
    
    # Model prediction
    result = model.predict(im)
    print(f"==> detect result: {result}")
    
    # Visualize detection results
    labels = generate_labels("labels.txt")
    vis_im = visualize(im, result, labels)
    cv2.imwrite("vis_image.jpg", vis_im)
    
    # Clone the model and perform prediction
    clone_model = model.clone()
    clone_result = clone_model.predict(im)
    print(f"==> detect clone result: {clone_result}")
  • Inference using C++:

    #include <memory>
    #include <opencv2/opencv.hpp>
    
    // For convenience, the module uses only CUDA and TensorRT, with the rest implemented using standard libraries
    #include "deploy/model.hpp"  // Contains model inference-related class definitions
    #include "deploy/option.hpp"  // Contains inference option configuration class definitions
    #include "deploy/result.hpp"  // Contains inference result definitions
    
    int main() {
        // Configure inference options
        deploy::InferOption option;
        option.enableSwapRB();  // Enable channel swapping (from BGR to RGB)
    
        // Initialize the model
        auto model = std::make_unique<deploy::DetectModel>("yolo11n-with-plugin.engine", option);
    
        // Load an image
        cv::Mat cvim = cv::imread("test_image.jpg");
        deploy::Image im(cvim.data, cvim.cols, cvim.rows);
    
        // Model prediction
        deploy::DetResult result = model->predict(im);
    
        // Visualization (code omitted)
        // ...  // Visualization code not provided, can be implemented as needed
    
        // Clone the model and perform prediction
        auto clone_model = model->clone();
        deploy::DetResult clone_result = clone_model->predict(im);
    
        return 0;  // Program ends normally
    }

For more deployment examples, please refer to the Model Deployment Examples section.
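The clone() calls in both examples above tie into the multi-context parallel inference mentioned earlier: each clone can be driven from its own thread. A minimal Python sketch under that assumption, reusing the engine and image from the Quick Start:

    from concurrent.futures import ThreadPoolExecutor

    import cv2
    from tensorrt_yolo.infer import DetectModel, InferOption

    option = InferOption()
    option.enable_swap_rb()
    model = DetectModel("yolo11n-with-plugin.engine", option)

    im = cv2.imread("test_image.jpg")

    # One clone per worker thread; each clone is assumed to own its own execution context.
    workers = [model.clone() for _ in range(4)]

    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        results = list(pool.map(lambda m: m.predict(im), workers))

    print(f"==> collected {len(results)} results from parallel inference")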

🖥️ Model Support List

(Example result visualizations: Detect, Segment, Pose, OBB)

Symbol legend: (1) ✅ : Supported; (2) ❔ : In progress; (3) ❎ : Not supported; (4) 🟢 : Self-implemented export required before inference.

| Task Scenario | Model | CLI Export | Inference Deployment |
| --- | --- | --- | --- |
| Detect | ultralytics/yolov3 | ✅ | ✅ |
| Detect | ultralytics/yolov5 | ✅ | ✅ |
| Detect | meituan/YOLOv6 | ❎ Refer to official export tutorial | ✅ |
| Detect | WongKinYiu/yolov7 | ❎ Refer to official export tutorial | ✅ |
| Detect | WongKinYiu/yolov9 | ❎ Refer to official export tutorial | ✅ |
| Detect | THU-MIG/yolov10 | ✅ | ✅ |
| Detect | ultralytics/ultralytics | ✅ | ✅ |
| Detect | PaddleDetection/PP-YOLOE+ | ✅ | ✅ |
| Segment | ultralytics/yolov3 | ✅ | ✅ |
| Segment | ultralytics/yolov5 | ✅ | ✅ |
| Segment | meituan/YOLOv6-seg | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
| Segment | WongKinYiu/yolov7 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
| Segment | WongKinYiu/yolov9 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
| Segment | ultralytics/ultralytics | ✅ | ✅ |
| Classify | ultralytics/yolov3 | ✅ | ✅ |
| Classify | ultralytics/yolov5 | ✅ | ✅ |
| Classify | ultralytics/ultralytics | ✅ | ✅ |
| Pose | ultralytics/ultralytics | ✅ | ✅ |
| OBB | ultralytics/ultralytics | ✅ | ✅ |

📄 License

TensorRT-YOLO is licensed under the GPL-3.0 License, an OSI-approved open-source license that is ideal for students and enthusiasts, fostering open collaboration and knowledge sharing. Please refer to the LICENSE file for more details.

Thank you for choosing TensorRT-YOLO. We encourage open collaboration and knowledge sharing, and we hope you will comply with the relevant terms of the open-source license.

📞 Contact

For bug reports and feature requests regarding TensorRT-YOLO, please visit GitHub Issues!

🙏 Thanks

Featured on HelloGitHub