
Report resource usage #83

Merged: victorreijgwart merged 12 commits into main from feature/report_resource_usage on Nov 20, 2024
Conversation

victorreijgwart (Member) commented on Nov 19, 2024

Description

This PR introduces a ResourceMonitor class to measure resource usage (RAM, CPU time, wall time). Additionally, it updates the repository's pull request template to standardize the inclusion of performance and accuracy benchmarking results and make it more descriptive.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that causes existing functionality to not work as expected)
  • Other (please describe):

Detailed Summary

The primary motivation for this PR is to begin tracking performance to avoid regressions and monitor the evolution of the framework over time. While the repository's tests and CI pipelines verify correctness, they do not account for efficiency changes. Additionally, while the tests ensure that wavemap's different map types and measurement integrators are lossless with respect to each other, they do not assess the overall end-to-end accuracy of our reconstruction pipelines.

To address this gap, we have set up a dedicated benchmarking server and extended the evaluation code from our RSS paper. This will allow us to benchmark different wavemap releases against each other and attach updated performance and accuracy results to future PRs.

On a practical level, the code changes in this PR:

  • Extends the repository's pull request template to be more descriptive and include benchmarking results.
  • Adds a ResourceMonitor class for measuring CPU time, wall time, and RAM usage during macro-benchmarking (a sketch of how such measurements can be taken follows this list).
  • Reports resource usage stats after processing a dataset with rosbag_processor from wavemap_ros, making it easier for users to run their own benchmarks.
  • Adjusts wavemap's config schemas to resolve false-positive validation warnings caused by CLion bug IJPL-63581.
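
To give an idea of the underlying mechanism, here is a minimal, self-contained sketch of how CPU time, wall time, and RAM usage can be measured on Linux. The class name SimpleResourceMonitor and its interface are illustrative assumptions, not wavemap's actual ResourceMonitor API. Note that getrusage() and /proc are Linux-specific, which is why guarding against compile errors on non-Linux systems comes up in the review discussion below.

```cpp
// Sketch only: illustrates the measurement technique, not wavemap's API.
#include <sys/resource.h>
#include <sys/time.h>

#include <chrono>
#include <fstream>
#include <string>

class SimpleResourceMonitor {
 public:
  void start() {
    start_wall_ = std::chrono::steady_clock::now();
    start_cpu_ = cpuTimeSeconds();
  }

  // Elapsed real time since start(), in seconds.
  double wallTimeSeconds() const {
    return std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start_wall_).count();
  }

  // CPU time (user + system) consumed by this process since start().
  double cpuTimeDeltaSeconds() const { return cpuTimeSeconds() - start_cpu_; }

  // Resident set size in MB, read from /proc/self/status (Linux only).
  static double currentRamMB() {
    std::ifstream status("/proc/self/status");
    std::string token;
    while (status >> token) {
      if (token == "VmRSS:") {
        double kilobytes = 0.0;
        status >> kilobytes;  // VmRSS is reported in kB.
        return kilobytes / 1024.0;
      }
    }
    return -1.0;  // Not available on this platform.
  }

 private:
  static double cpuTimeSeconds() {
    rusage usage{};
    getrusage(RUSAGE_SELF, &usage);
    const auto toSec = [](const timeval& tv) {
      return static_cast<double>(tv.tv_sec) + tv.tv_usec * 1e-6;
    };
    return toSec(usage.ru_utime) + toSec(usage.ru_stime);
  }

  std::chrono::steady_clock::time_point start_wall_;
  double start_cpu_ = 0.0;
};
```

Reading VmRSS from /proc/self/status yields the process's resident set size, which is one common way to obtain the kind of RAM figure reported in the benchmarks below.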

API Changes

List any changes to wavemap's APIs to help users update their code. Write "None" if there are no changes.

C++ API:

  • Additions
    • ResourceMonitor class in library/cpp/include/wavemap/core/utils/profile/resource_usage_monitor.h
  • Refactoring
    • Header wavemap/core/utils/profiler_interface.h moved to wavemap/core/utils/profile/profiler_interface.h; the corresponding include-path update is shown below
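
For users updating their code, the include change is:

```cpp
// Before:
#include "wavemap/core/utils/profiler_interface.h"
// After:
#include "wavemap/core/utils/profile/profiler_interface.h"
```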

Python API:

  • None

ROS1 Interface:

  • None

Review Notes

We are particularly interested in feedback on the new pull request template.

Testing

Automated Tests

Tests have been added for the ResourceMonitor and extended for the Stopwatch and ThreadPool classes; an illustrative sketch follows.
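
As an illustration of what such tests can assert, here is a sketch against the hypothetical SimpleResourceMonitor shown earlier, assuming GoogleTest. This is not wavemap's actual test code.

```cpp
#include <gtest/gtest.h>

#include "simple_resource_monitor.h"  // The sketch class shown earlier (hypothetical).

TEST(ResourceMonitorTest, TimersAdvance) {
  SimpleResourceMonitor monitor;
  monitor.start();
  // Burn a little CPU so the counters have something to measure.
  volatile double sink = 0.0;
  for (int i = 0; i < 10000000; ++i) {
    sink += static_cast<double>(i) * 1e-9;
  }
  EXPECT_GT(monitor.wallTimeSeconds(), 0.0);
  // CPU-time resolution can be coarse, so only require a non-negative delta.
  EXPECT_GE(monitor.cpuTimeDeltaSeconds(), 0.0);
}

TEST(ResourceMonitorTest, RamReadingIsPlausible) {
  // On Linux, VmRSS should be positive; the sketch returns -1.0 elsewhere.
  const double ram_mb = SimpleResourceMonitor::currentRamMB();
  EXPECT_TRUE(ram_mb > 0.0 || ram_mb == -1.0);
}
```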

Manual Tests

No manual tests were run for this PR, other than the benchmarks below.

Benchmarks

We conduct benchmarks on a dedicated server with the following specifications:

  • CPU: Intel i9-9900K (8 cores, 16 threads, 4.8–5.0 GHz with turbo boost)
  • RAM: 32GB DDR4 at 3200 MHz
  • OS: Debian 12
  • API: C++ and ROS1
  • Installation: Local catkin workspace

The server is liquid-cooled, and its CPU power profile is set to performance. To ensure accurate results, no other jobs are active during benchmarks, and we allow the server to cool down between runs.

For a complete description of the evaluation procedure and metrics, please refer to the RSS paper; we briefly summarize them below.

The following datasets are used:

  1. Newer College (multi-cam, cloister sequence):
    • Real Ouster OS0-128 LiDAR, poses estimated with FastLIO2 odometry
  2. Panoptic Mapping (run1):
    • Simulated depth camera with noise-free data, accurate ground truth geometry, and poses

We compare the performance of supereight2, octomap, voxblox, and wavemap (ours), the latter with both beam-based and ray-based measurement integrators.

In each test, all frameworks are configured with identical voxel sizes, perception ranges, and calibrations. To ensure fairness, we also use the same data loader for all frameworks.

Performance

The following metrics are compared:

  • RAM: Total memory used by the map-building process, including the data loader
  • Map size: Memory used by the map only
  • CPU time: Total computation time used by the map-building process
  • Wall time: Real-time duration to complete mapping
  • AUC: Area under the ROC curve, which quantifies overall classification performance (a sketch of how it can be computed follows the note below)

Note that dividing the CPU time by the wall time yields the CPU utilization, which can exceed 100% on multi-core systems. For example, in the Newer College 20cm results below, ours (rays) uses 69.18 s of CPU time over 35.52 s of wall time, i.e. roughly 195% CPU utilization.
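
For reference, AUC can be computed directly from per-voxel scores and ground-truth labels using the rank-sum (Mann-Whitney U) identity. The sketch below illustrates the general technique under that assumption; it is not wavemap's evaluation code, and it does not handle tied scores.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Computes the area under the ROC curve from per-voxel occupancy scores and
// ground-truth labels, via the rank-sum (Mann-Whitney U) identity.
double computeAuc(const std::vector<double>& scores,
                  const std::vector<bool>& occupied_gt) {
  // Rank all samples by ascending score.
  std::vector<std::size_t> order(scores.size());
  for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
  std::sort(order.begin(), order.end(), [&scores](std::size_t a, std::size_t b) {
    return scores[a] < scores[b];
  });
  // Sum the (1-based) ranks of the positive samples.
  double positive_rank_sum = 0.0;
  std::size_t num_positives = 0;
  for (std::size_t rank = 0; rank < order.size(); ++rank) {
    if (occupied_gt[order[rank]]) {
      positive_rank_sum += static_cast<double>(rank + 1);
      ++num_positives;
    }
  }
  const std::size_t num_negatives = order.size() - num_positives;
  if (num_positives == 0 || num_negatives == 0) {
    return 0.5;  // AUC is undefined with only one class; return chance level.
  }
  // AUC = U / (num_positives * num_negatives).
  const double u = positive_rank_sum -
                   static_cast<double>(num_positives) * (num_positives + 1) / 2.0;
  return u / (static_cast<double>(num_positives) *
              static_cast<double>(num_negatives));
}
```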

Newer College 20cm

| Framework    | RAM (MB) | Map size (MB) | CPU time (s) | Wall time (s) |  AUC |
|--------------|---------:|--------------:|-------------:|--------------:|-----:|
| supereight2  |   249.25 |        102.80 |       455.87 |         59.74 | 0.87 |
| octomap      |   198.49 |         19.82 |       688.71 |        709.99 | 0.82 |
| voxblox      |   279.07 |         63.25 |       219.13 |         40.06 | 0.92 |
| ours (beams) |   153.31 |         13.80 |        92.30 |         41.38 | 0.91 |
| ours (rays)  |   148.70 |         13.17 |        69.18 |         35.52 | 0.91 |

Newer College 5cm

| Framework    | RAM (MB) | Map size (MB) | CPU time (s) | Wall time (s) |  AUC |
|--------------|---------:|--------------:|-------------:|--------------:|-----:|
| supereight2  |  2842.90 |       2225.81 |      3276.71 |        390.95 | 0.89 |
| octomap      | 14067.15 |        935.58 |     36252.70 |      35790.60 | 0.90 |
| voxblox      |  3643.60 |       2253.13 |      1717.05 |        142.29 | 0.97 |
| ours (beams) |   871.07 |        576.56 |      2384.99 |        194.63 | 0.97 |
| ours (rays)  |   826.55 |        409.39 |      1152.18 |        113.68 | 0.96 |

Panoptic Flat 5cm

| Framework    | RAM (MB) | Map size (MB) | CPU time (s) | Wall time (s) |  AUC |
|--------------|---------:|--------------:|-------------:|--------------:|-----:|
| supereight2  |   158.82 |         43.96 |        26.85 |          3.64 | 0.93 |
| octomap      |   158.55 |          6.19 |       130.32 |        129.00 | 0.95 |
| voxblox      |   260.06 |         35.19 |        63.14 |          8.98 | 0.99 |
| ours (beams) |   122.03 |          6.44 |         6.42 |          3.59 | 0.99 |
| ours (rays)  |   124.53 |          7.10 |         4.32 |          3.02 | 0.99 |

Panoptic Flat 2cm

| Framework    | RAM (MB) | Map size (MB) | CPU time (s) | Wall time (s) |  AUC |
|--------------|---------:|--------------:|-------------:|--------------:|-----:|
| supereight2  |   443.50 |        271.86 |        46.04 |          5.87 | 0.95 |
| octomap      |  6057.02 |         48.58 |       773.16 |        763.39 | 0.99 |
| voxblox      |   670.52 |        332.02 |       237.53 |         21.37 | 1.00 |
| ours (beams) |   269.81 |         58.44 |        53.78 |          7.79 | 1.00 |
| ours (rays)  |   261.01 |         61.94 |        25.02 |          4.92 | 1.00 |

Accuracy over distance

The graph below shows how accuracy varies with the distance to the surface. A high accuracy just behind the surface (small negative distances) corresponds to a high recall on thin objects. A smaller dip at the surface (distance 0.0) indicates better classification performance in challenging areas, improving safety in near-surface operations.

[Figure: classification accuracy as a function of signed distance to the surface, aggregated over all datasets]

Checklist

General

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated tests as required
  • Any required changes in dependencies have been committed and pushed

Documentation (where applicable)

  • I have updated the installation instructions (in docs/pages/installation)
  • I have updated the code's inline API documentation (e.g., docstrings)
  • I have updated the parameter documentation (in docs/pages/parameters)
  • I have updated/extended the tutorials (in docs/pages/tutorials)

victorreijgwart added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels on Nov 19, 2024
LionelOtt (Contributor) commented:
Looks good overall to me. I think once the code to prevent compile errors on non-Linux systems is in, we're good. Maybe adding the "summary generator" could be nice for this PR as well.

LionelOtt (Contributor) left a review comment:
Looks good with the latest modifications.

victorreijgwart merged commit fa4b8e2 into main on Nov 20, 2024
25 checks passed
victorreijgwart deleted the feature/report_resource_usage branch on November 20, 2024 at 14:33