Memory profiler #1201
@alv-around very exciting! I'm not sure about prometheus-client vs metrics-process. I suggested the latter just because that's what I saw reth used. I haven't looked into the PR yet, but I guess there are two different approaches:
It sounds like you currently went with (1)? It'd be nice if we could also support (2), or maybe feature flag between them somehow. I'll have more concrete suggestions after I look at the code. Also tagging @luffykai to take a look.
crates/sdk/src/lib.rs
@@ -148,6 +178,8 @@ impl Sdk {
        VC::Periphery: Chip<SC>,
    {
        let app_prover = AppProver::new(app_pk.app_vm_pk.clone(), app_committed_exe);
        #[cfg(feature = "memory_profiler")]
        self.profiler.update_memory_usage(Method::Proof);
the issue is this only gets the memory usage at this point in time, but I think isn't able to capture peak memory usage during the overall function span?
Amazing progress so far!
Adding to my other comments after taking a skim: if possible, I'd like something where the memory profiling runs in a separate loop/thread, we collect metrics via a recorder, and the different SDK functions are scoped by labels (see https://github.com/openvm-org/openvm/blob/main/docs/crates/metrics.md). Then, in some separate process or post-processing step in openvm-prof (crates/prof), we can set up an analyzer or Grafana dashboard that lets us put more concrete numbers to different workloads.
@jonathanpwang yes! I completely agree with your feedback. The last time I checked the issue was last year, and I was completely unaware of metrics-process. The metrics-process approach and your proposal make much more sense :) Just a final note: I would go for threads over async loops, as proving will most likely block the metrics measurement.
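To make the thread-based idea concrete, here is a minimal sketch of a background sampler that records the peak value seen while some work runs, rather than a single point-in-time reading. Everything in it is an assumption for illustration: `PeakSampler`, `start`, and `finish` are hypothetical names that do not exist in the SDK, and the `sample` closure stands in for whatever actually reads process memory (for example a metrics-process collector). It uses only the standard library.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper (not part of the SDK): polls `sample` on a dedicated
/// OS thread and remembers the largest value observed.
pub struct PeakSampler {
    stop: Arc<AtomicBool>,
    peak: Arc<AtomicU64>,
    handle: Option<JoinHandle<()>>,
}

impl PeakSampler {
    /// Starts sampling every `interval`. `sample` is an assumed stand-in for
    /// a real memory reader and must be callable from another thread.
    pub fn start<F>(interval: Duration, sample: F) -> Self
    where
        F: Fn() -> u64 + Send + 'static,
    {
        let stop = Arc::new(AtomicBool::new(false));
        let peak = Arc::new(AtomicU64::new(0));
        let (stop_t, peak_t) = (Arc::clone(&stop), Arc::clone(&peak));
        let handle = thread::spawn(move || loop {
            // Keep the maximum of the stored peak and the new reading.
            peak_t.fetch_max(sample(), Ordering::Relaxed);
            if stop_t.load(Ordering::Relaxed) {
                break;
            }
            thread::sleep(interval);
        });
        Self { stop, peak, handle: Some(handle) }
    }

    /// Signals the sampling thread to stop, waits for it, and returns the
    /// peak value it observed.
    pub fn finish(mut self) -> u64 {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(handle) = self.handle.take() {
            handle.join().expect("sampler thread panicked");
        }
        self.peak.load(Ordering::Relaxed)
    }
}
```

Because the sampler runs on its own OS thread, a blocking, CPU-bound proving call on the caller's thread does not stop it from taking readings. A caller would wrap a span of work between `start` and `finish` and then report the returned peak under a label for that span.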
Hi @jonathanpwang, here is a small status update: I haven't had much time to work on it, and I struggle a bit in adding the [...]. Apart from that, I have currently set up the Prometheus Pushgateway: every 10 seconds it pushes the process metrics to an intermediate instance from which the "classical" Prometheus server can pull the data. This has the advantage that it is lightweight and works better with batch jobs. The alternative, opening an endpoint for Prometheus, adds a lot of dependencies to the crate (around 60) and does not work very well. That alternative would work for an API that wraps the Sdk, and if that is the goal, we might be better off adding the metrics collection to the API rather than to the SDK itself. So far, I'm not too happy with the outcome... Anyway, I just wanted to keep you in the loop. Let me know if you have any remarks.
Here is a minimal implementation of memory profiler for the sdk (#1029) using the official prometheus-client (see also here).
Currently the PR does:
- memory_profiler feature and provides a text breakdown of the memory consumption of each method once the SDK is dropped. Below is the test sample from running cargo run --example sdk_app --features memory_profiler:
- memory_profiler feature, so I don't need to touch existing code. I wanted to avoid making any change to the SDK interface without getting more feedback.

Possible adjustments to the code:
- Sdk struct...
These are some ideas that I had, but I wanted to share them here before jumping into the implementation. Any other ideas not covered here are more than welcome.
EDIT:
Ok, I just saw this tip, I may come back and swap prometheus-client for metrics-process 🫠