-
Notifications
You must be signed in to change notification settings - Fork 469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend ImageFolderDataset to support import of COCO detection data #2612
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2612 +/- ##
==========================================
- Coverage 81.86% 81.84% -0.02%
==========================================
Files 832 836 +4
Lines 106399 107398 +999
==========================================
+ Hits 87099 87896 +797
- Misses 19300 19502 +202 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's awesome! The implementation is rather concise 😄
Looks like it fit pretty well with the provisioned types for ImageFolderDataset
.
I have some minor comments, but otherwise great job!
I tested the code on a tiny subset of COCO and it works as expected.
/edit: sorry for the late review btw 😅 been quite busy
@@ -82,7 +85,7 @@ pub struct SegmentationMask { | |||
/// Object detection bounding box annotation. | |||
#[derive(Deserialize, Serialize, Debug, Clone, PartialEq)] | |||
pub struct BoundingBox { | |||
/// Coordinates. | |||
/// Coordinates in [x,y,width,height] format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just double checked the format, and for clarity I think I would be even more precise by stating [x_min, y_min, width, height]
since there also exists [cx, cy, width, height]
which could be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I actually missed that, was not aware of the _min
thing at all, will update the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the comment wasn't updated with the last changes 🤔
4e76f29
to
4b32ab8
Compare
@laggui updated, I usually amend and force push to keep the history clean, hope that's OK for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually amend and force push to keep the history clean, hope that's OK for you.
Might mess up the github review UI for stuff that was already reviewed and untouched, but I don't have any issues with that. We squash & merge anyway so it will be reduced to a single commit 👍
Looks like the comment update was missed, otherwise it will be good to go!
@@ -82,7 +85,7 @@ pub struct SegmentationMask { | |||
/// Object detection bounding box annotation. | |||
#[derive(Deserialize, Serialize, Debug, Clone, PartialEq)] | |||
pub struct BoundingBox { | |||
/// Coordinates. | |||
/// Coordinates in [x,y,width,height] format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the comment wasn't updated with the last changes 🤔
COCO is a popular dataset which defines an own dataset format: https://cocodataset.org/#format-data This commit introduces a ImageFolderDataset::new_coco_detection() function, which expects a path to the JSON annotations file in COCO format and a path where the actual images are stored. Note, that while COCO also offers segmentation and pose estimation data, for now only the import of detection (bounding boxes) data is supported.
4b32ab8
to
1687e55
Compare
sorry, C&P fail 🤦🏻♂️ |
Thank you 🙏 |
Pull Request Template
Checklist
run-checks all
script has been executed*.* with wgpu errors not related to this PR
Changes
COCO is a popular dataset which defines an own dataset format: https://cocodataset.org/#format-data
This commit introduces an
ImageFolderDataset::new_coco_detection()
function, which expects a path to the JSON annotations file in COCO format and a path where the actual images are stored.Note, that while COCO also offers segmentation and pose estimation data, for now only the import of detection (bounding boxes) data is supported.
Testing
Apart from testing with actual COCO data, a small dataset containing three images and annotations in COCO format has been added to
crates/burn-dataset/tests/data
. Each image has a different amount of annotations, the test checks if the imported dataset contains the correct number of images and the expected number of annotations per image.