Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactors topdown service to support multiple types of calculations and adds support for Sapphire Rapids #576

Merged
merged 11 commits into from
Oct 25, 2024

Conversation

ilumsden
Copy link
Contributor

This PR refactors the topdown service to allow it to be easily extended to support top-down calculations for processors besides Haswell/Broadwell. To do this, I've made the following changes:

  • Creates a new TopdownCalculator base class to represent calculations for different processors
  • Moves the existing calculations to a new HaswellTopdown class, which inherits from TopdownCalculator
  • Refactors IntelTopdown to offload all the computation and associated tracking to a pointer to a TopdownCalculator object

This PR also uses this new infrastructure to add support for Sapphire Rapids and Emerald Rapids CPUs. The calculations for these CPUs were obtained from Intel's perfmon repo, specifically the following file: https://github.com/intel/perfmon/blob/main/SPR/metrics/perf/sapphirerapids_metrics_perf.json. The support added by this PR only covers the first two levels of the top-down hierarchy. In the future, this could be expanded to cover as much of the 6 levels for Sapphire Rapids as desired.

This PR is still work-in-progress. The following tasks need to be completed before this is ready for review:

  • Add a CMake mechanism to specify an architecture (currently planning on supporting architecture names from archspec)
  • Use that mechanism to select (at build time) the correct TopdownCalculator child class to use in the topdown service
  • Examine if any changes need to be made to the topdown built-in option

@ilumsden ilumsden marked this pull request as draft July 21, 2024 00:25
@ilumsden
Copy link
Contributor Author

In the future, it might be useful to consider embedding archspec-json if more architecture-specific features are added to Caliper.

@ilumsden
Copy link
Contributor Author

ilumsden commented Sep 9, 2024

Outstanding work on this PR:

  • Find a way to use PAPI multiplexing with the SPR counters needed for topdown Deferring as future work since it is not needed at this time
  • Add testing
  • Update documentation (if it exists) about the topdown service

@ilumsden ilumsden marked this pull request as ready for review October 9, 2024 21:01
@ilumsden
Copy link
Contributor Author

ilumsden commented Oct 9, 2024

This PR is fully tested on Poodle and is now ready for review. There are only a few outstanding things left, namely adding documentation.

@ilumsden ilumsden force-pushed the topdown-spr branch 2 times, most recently from d97ca7b to 20f37a0 Compare October 21, 2024 17:51
@daboehme
Copy link
Member

daboehme commented Oct 25, 2024

Hi @ilumsden, there's a new change in Caliper this week that breaks up the giant option spec string in controllers.cpp. That was mainly because we ran into MSVC's string literal size limit, but it also makes it so we can pick options based on the build configuration. I think that's a better approach for selecting the topdown options than adding a new ConfigManager function. Can you try and adapt the PR to the new approach? Essentially you can define separate option spec strings for each configuration in controllers.cpp and then add whichever one is appropriate to the builtin_option_specs_list in ConfigManager.cpp. Should be pretty self-explanatory. Thanks!

@ilumsden
Copy link
Contributor Author

@daboehme I've worked that change into the PR. I did have to make one small change to that mechanism to handle architecture detection. I had to move where builtin_option_specs_list gets populated from global scope to the constructor of ConfigManagerImpl due to having to do string comparison (which is extremely difficult to do at compile time in C++11).

@daboehme daboehme merged commit b7139cc into LLNL:master Oct 25, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants