-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adapter,controller: add replica metrics history #29254
adapter,controller: add replica metrics history #29254
Conversation
8efe34a
to
eab0341
Compare
dd399ae
to
5f544d7
Compare
5f544d7
to
ec750bf
Compare
This commit adds a new builtin source, `mz_cluster_replica_metrics_history`, and a corresponding view `mz_cluster_replica_utilization_history` on top of it. They contain the data already present in `mz_cluster_replica_{metrics,utilization}`, but as (pseudo-) append-only collections, rather than retained-metrics collections. This plan is to remove the old retained-metrics collections in favor of the new append-only ones, but we need to wait until the new ones had time to backfill the historical data present in the old ones. This commit does not include any code to partially truncate `mz_cluster_replica_metrics_history`, which is something we want to ensure it doesn't grow unboundedly. A subsequent commit will deal with that.
This commit adds truncation of the replica metrics history collection during bootstrap. In contrast to the existing status history collections, this collection is truncated based on the age of events instead of the number of events per ID. The implementation of the truncation method is kept simple. In the future it can be refined to avoid reading the entire collection into memory, if it turns out that the size of a metrics collection is problematic.
ec750bf
to
deeaca1
Compare
ScalarType::TimestampTz { precision: None }.nullable(false), | ||
) | ||
.finish() | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't really fit into healthcheck.rs
, but we already have the statement execution history stuff here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adapter changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only looked at the new truncation method, and that one looked good to me!
TFTRs! |
This PR adds a new builtin source
mz_cluster_replica_metrics_history
, and a corresponding viewmz_cluster_replica_utilization_history
on top of it. They contain the data already present inmz_cluster_replica_{metrics,utilization}
, but as (pseudo-) append-only collections, rather than retained-metrics collections.This plan is to remove the old retained-metrics collections in favor of the new append-only ones, but we need to wait until the new ones had time to backfill the historical data present in the old ones.
Motivation
Part of https://github.com/MaterializeInc/database-issues/issues/8403
Tips for reviewer
Some of our user docs (Troubleshooting, Grafana, Datadog) refer to
mz_cluster_replica_utilization
. I'm not touching these here, but we'll need to change them when/before we remove that relation.Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.