
fredtrentini/people-counter


People Counter

This project trains a CNN to accurately count people in videos.

Try it out on Google Colab

Steps

1- Read raw video data to create an image dataset

Input: ./data-videos/[*video_name.avi|mp4]

Output: ./dataset/[*video_name]/[*frame.jpg]
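
The input/output layout above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes OpenCV (opencv-python) is installed, and the names frame_path, extract_frames, every_n and the zero-padded frame file name are hypothetical.

```python
from pathlib import Path

DATASET_DIR = Path("dataset")


def frame_path(video_path: Path, frame_index: int, dataset_dir: Path = DATASET_DIR) -> Path:
    """Map a video file and a frame index to ./dataset/<video_name>/<frame>.jpg."""
    # The zero-padded file name is an assumption, not the project's naming scheme.
    return dataset_dir / video_path.stem / f"{frame_index:06d}.jpg"


def extract_frames(video_path: Path, dataset_dir: Path = DATASET_DIR, every_n: int = 1) -> int:
    """Save every n-th frame of a video as a JPEG and return how many were written."""
    import cv2  # imported lazily so frame_path works without OpenCV installed

    capture = cv2.VideoCapture(str(video_path))
    written = 0
    index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of stream or unreadable file
                break
            if index % every_n == 0:
                target = frame_path(video_path, index, dataset_dir)
                target.parent.mkdir(parents=True, exist_ok=True)
                cv2.imwrite(str(target), frame)
                written += 1
            index += 1
    finally:
        capture.release()
    return written
```

Calling extract_frames(Path("data-videos/street.mp4")) would write files such as dataset/street/000000.jpg. A larger every_n keeps the dataset small when consecutive frames are nearly identical.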

2- Read the image dataset to generate annotations

Automatically generated annotations are not perfect. Consider manually correcting them in CVAT and re-running the next steps. (Annotations are exported and imported in COCO 1.0 format.)

Input: ./dataset/[*video_name]/[*frame.jpg]

Output: ./dataset-annotations/[*video_name]/[*frame_annotation.json]
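
For orientation, here is a rough sketch of what a COCO-style annotation document looks like when built from detector output. The function name to_coco, the input tuple layout and the single "person" category with id 1 are assumptions for illustration. The project's real files may carry different ids and extra fields.

```python
import json


def to_coco(frames, category_name="person"):
    """Build a COCO-style dict.

    frames: list of (file_name, width, height, boxes), where each box is
    given as pixel corners (x1, y1, x2, y2).
    """
    coco = {
        "info": {"description": "auto-generated annotations"},
        "licenses": [],
        "categories": [{"id": 1, "name": category_name}],
        "images": [],
        "annotations": [],
    }
    next_annotation_id = 1
    for image_id, (file_name, width, height, boxes) in enumerate(frames, start=1):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for x1, y1, x2, y2 in boxes:
            box_width, box_height = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": next_annotation_id,
                "image_id": image_id,
                "category_id": 1,
                # COCO stores boxes as [x, y, width, height], not as two corners.
                "bbox": [x1, y1, box_width, box_height],
                "area": box_width * box_height,
                "iscrowd": 0,
            })
            next_annotation_id += 1
    return coco


example = to_coco([("000000.jpg", 640, 480, [(10, 20, 110, 220)])])
print(json.dumps(example["annotations"][0]["bbox"]))
```

The corner-to-[x, y, width, height] conversion is the step most likely to go wrong when a detector reports boxes as two corners.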

3- Using the image dataset and its annotations, fine-tune the pretrained model into a custom model optimized for detection in the dataset's image domain

4- Test people detection in real time, reading from the current video device
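
Once a detector has produced labelled boxes for a frame, the count itself is a simple filter. The sketch below assumes a hypothetical Detection record and a 0.5 confidence threshold. Neither comes from the project, and the real detector output will have its own structure.

```python
from typing import Iterable, NamedTuple


class Detection(NamedTuple):
    """Hypothetical per-box detector output: class label and confidence score."""
    label: str
    confidence: float


def count_people(detections: Iterable[Detection], min_confidence: float = 0.5) -> int:
    """Count detections labelled "person" at or above the confidence threshold."""
    return sum(
        1
        for detection in detections
        if detection.label == "person" and detection.confidence >= min_confidence
    )


frame_detections = [
    Detection("person", 0.91),
    Detection("person", 0.34),  # below the threshold, ignored
    Detection("bicycle", 0.88),  # not a person, ignored
    Detection("person", 0.50),
]
print(count_people(frame_detections))
```

Raising min_confidence trades missed people for fewer false positives, so the right value depends on the footage.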

Requirements

pip install -r requirements.txt

Scripts

  • test.py: Visualize and benchmark the existing annotations
    • Generate the dataset and annotations if they don't exist
    • Report how effective the selected model is at detecting people
  • merge_cvat_annotations.py: Importing annotations from CVAT loses or overwrites information such as category_id, info and categories. This script fixes that by taking two annotation files and merging their information correctly
    • Input: two relative filenames, expected to be:
      • File with the correct metadata (previously auto-generated by test.py)
      • File with the correct labels (just corrected manually in CVAT and imported from it)
    • Output: the merged file, written to the current file path (overwrites the existing file)
  • main.py: Main project file, which does the following:
    • Test the application in real time using the current video device
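
The merge performed by merge_cvat_annotations.py could look roughly like the sketch below. It is not the script's actual code. It assumes that metadata (info, licenses, categories) should come from the auto-generated file, that images and annotations should come from the CVAT file, and that category_id values can be matched by category name. The names merge_coco, merge_files and METADATA_KEYS are hypothetical.

```python
import json
from pathlib import Path

# Top-level keys taken from the auto-generated file (an assumption).
METADATA_KEYS = ("info", "licenses", "categories")


def merge_coco(metadata: dict, labels: dict) -> dict:
    """Combine metadata from one COCO dict with the labels of another."""
    # Category ids as used by the metadata file, looked up by name.
    name_to_id = {c["name"]: c["id"] for c in metadata.get("categories", [])}
    # Category ids as used by the labels file, mapped back to names.
    label_id_to_name = {c["id"]: c["name"] for c in labels.get("categories", [])}

    merged = dict(labels)
    for key in METADATA_KEYS:
        if key in metadata:
            merged[key] = metadata[key]

    merged["annotations"] = []
    for annotation in labels.get("annotations", []):
        fixed = dict(annotation)  # copy so the input is left untouched
        name = label_id_to_name.get(annotation.get("category_id"))
        if name in name_to_id:
            fixed["category_id"] = name_to_id[name]
        merged["annotations"].append(fixed)
    return merged


def merge_files(metadata_path: str, labels_path: str, output_path: str) -> None:
    """Read both annotation files, merge them, and write the result."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    labels = json.loads(Path(labels_path).read_text(encoding="utf-8"))
    merged = merge_coco(metadata, labels)
    Path(output_path).write_text(json.dumps(merged, indent=2), encoding="utf-8")
```

Matching categories by name rather than by id is a design choice of this sketch: it still works if the two files number the same category differently.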

Practical overview (how to use these scripts)

  • First, download some video files and move them to the data-videos folder.
  • Run test.py to build the dataset and its annotations, generate initial predictions, and visualize those predictions.
  • Create a task on CVAT and import the annotations to view the predicted bounding boxes, then manually correct them as needed.
  • Download the updated annotations as "COCO 1.0" and run merge_cvat_annotations.py, passing the original annotations and the downloaded annotations as arguments.
  • Run train.py to fine-tune the pretrained model.
  • Run python test.py -model yolov8s_trained to test the trained model against the labels it was trained on.
  • Run main.py to test the trained model against the current video device.
