feat(ingest-limits): Add GRPC service to read per tenant stream limits from kafka #15668

Merged
merged 12 commits into from
Jan 10, 2025

Conversation

periklis
Collaborator

@periklis periklis commented Jan 9, 2025

What this PR does / why we need it:
Adds a basic service that reads per-stream metadata from the separate topic (introduced by #15648) and serves it per tenant to other services (e.g. distributors, ingest-limits-frontend). The present implementation returns the count of streams recorded from the metadata topic within the configured window size (defaults to 1m). In addition, pods of this service use a round-robin balancer to read from partitions, which effectively means that duplicate data is possible during rebalancing events.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

  • The implementation removes the lastSeenAt field introduced in feat(distributor): Add stream metadata writes to separate topic #15648 in favor of the record's timestamp.
  • The GRPC response protobuf includes a reference to the stream rate message but is not using it yet. For the time being it is more of a proposal and needs scrutiny as to whether we want to keep it or replace it with a custom `GetRateLimitPerStream(tenantID, streamHash)`.
  • The kafka records are kept in a single map tenantID -> streamHash -> record.Timestamp, protected by a single RW mutex. If we see contention later on, could we consider striping the locks over tenants? AFAICS we would either fetch the limits per distributor request or introduce a periodic recheck to have less contention?
  • The ServeHTTP handler is temporarily introduced to inspect the service during development. It can be removed once we have the ingest-limits-frontend calling the service and the distributor enforcing limits.
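
The lock-striping idea raised in the notes above could look roughly like the Go sketch below. This is not what the PR implements (the PR uses a single RW mutex); the names `stripedMetadata`, `stripe`, `numStripes`, `record`, and `streamCount`, the stripe count of 16, and the use of FNV-1a to pick a stripe are all hypothetical choices for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// numStripes is a hypothetical stripe count chosen for this sketch.
const numStripes = 16

// stripe guards the metadata of the tenants that hash to it.
type stripe struct {
	mtx     sync.RWMutex
	streams map[string]map[uint64]int64 // tenantID -> streamHash -> timestamp
}

// stripedMetadata spreads tenants over several independently locked maps.
type stripedMetadata struct {
	stripes [numStripes]stripe
}

func newStripedMetadata() *stripedMetadata {
	s := &stripedMetadata{}
	for i := range s.stripes {
		s.stripes[i].streams = make(map[string]map[uint64]int64)
	}
	return s
}

// stripeFor picks a stripe by hashing the tenant ID with FNV-1a.
func (s *stripedMetadata) stripeFor(tenant string) *stripe {
	h := fnv.New32a()
	h.Write([]byte(tenant)) // Write on a hash.Hash never returns an error
	return &s.stripes[h.Sum32()%numStripes]
}

// record takes only the write lock of the tenant's stripe.
func (s *stripedMetadata) record(tenant string, streamHash uint64, ts int64) {
	st := s.stripeFor(tenant)
	st.mtx.Lock()
	defer st.mtx.Unlock()
	if st.streams[tenant] == nil {
		st.streams[tenant] = make(map[uint64]int64)
	}
	st.streams[tenant][streamHash] = ts
}

// streamCount takes only the read lock of the tenant's stripe.
func (s *stripedMetadata) streamCount(tenant string) int {
	st := s.stripeFor(tenant)
	st.mtx.RLock()
	defer st.mtx.RUnlock()
	return len(st.streams[tenant])
}

func main() {
	s := newStripedMetadata()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.record(fmt.Sprintf("tenant-%d", i%4), uint64(i), int64(i))
		}(i)
	}
	wg.Wait()
	fmt.Println(s.streamCount("tenant-0"))
}
```

Writers for tenants on different stripes no longer block each other, at the cost of a hash per call. Whether that trade is worth it depends on the contention actually observed, which is the open question in the note.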

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@periklis periklis self-assigned this Jan 9, 2025
@periklis periklis force-pushed the ingest-limiter-service branch from 6ca29f9 to 1e9e2df Compare January 9, 2025 10:42
@periklis periklis changed the title feat(ingest-limiter): Add basic service feat(ingest-limits): Add GRPC service to read per tenant stream limits from kafka Jan 10, 2025
@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Jan 10, 2025
Contributor

github-actions bot commented Jan 10, 2025

💻 Deploy preview deleted.

@periklis periklis marked this pull request as ready for review January 10, 2025 09:37
@periklis periklis requested a review from a team as a code owner January 10, 2025 09:37
@periklis periklis force-pushed the ingest-limiter-service branch from 2515c70 to 9b37696 Compare January 10, 2025 09:52
metadata: make(map[string]map[uint64]int64),
}

if cfg.IngestLimits.Enabled {
Contributor

Would it be better to wrap the call to NewIngestLimits with cfg.IngestLimits.Enabled instead of putting the check in NewIngestLimits? It feels weird to return a partially initialized struct. Is that a pattern we use elsewhere?

// the metadata map. If Kafka is not enabled, it simply waits for context cancellation.
// The method also starts a goroutine to periodically evict old streams from the metadata map.
func (s *IngestLimits) running(ctx context.Context) error {
if !s.cfg.Enabled {
Contributor

I suppose this is also kind of odd?

@@ -358,6 +359,7 @@ type Loki struct {
tenantConfigs *runtime.TenantConfigs
TenantLimits validation.TenantLimits
distributor *distributor.Distributor
ingestLimits *limits.IngestLimits
Contributor

Oh I see, it's because it's registered as a module we have that weird initialization issue?

@@ -382,6 +384,29 @@ func (t *Loki) initDistributor() (services.Service, error) {
return t.distributor, nil
}

func (t *Loki) initIngestLimits() (services.Service, error) {
if !t.Cfg.KafkaConfig.IngestLimits.Enabled {
Contributor

Hmm, it seems not, we have a second guard here?

Collaborator Author

Answering here for the above. The guards above are obsolete and predate the initIngestLimits, so keep this one here and remove the above.

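
For readers following the thread, the outcome (one guard in the module init function, none in the constructor or run loop) can be sketched as below. Every type here is a hypothetical stand-in: `Config`, `Service`, and this `IngestLimits` are simplified placeholders rather than Loki's real types, and returning `nil, nil` when the feature is disabled is an assumption of the sketch, since the thread does not show the guard's body.

```go
package main

import (
	"context"
	"fmt"
)

// Config is a hypothetical stand-in for the ingest limits configuration.
type Config struct{ Enabled bool }

// Service is a hypothetical stand-in for a module's service interface.
type Service interface {
	Run(ctx context.Context) error
}

// IngestLimits is a simplified placeholder for the service in this PR.
type IngestLimits struct {
	metadata map[string]map[uint64]int64
}

// NewIngestLimits always returns a fully initialized value; it does not
// look at the Enabled flag.
func NewIngestLimits() *IngestLimits {
	return &IngestLimits{metadata: make(map[string]map[uint64]int64)}
}

// Run has no Enabled check either; it simply waits for cancellation here.
func (s *IngestLimits) Run(ctx context.Context) error {
	<-ctx.Done()
	return nil
}

// initIngestLimits is the single place that checks the flag.
// Returning (nil, nil) when disabled is an assumption of this sketch.
func initIngestLimits(cfg Config) (Service, error) {
	if !cfg.Enabled {
		return nil, nil
	}
	return NewIngestLimits(), nil
}

func main() {
	for _, enabled := range []bool{false, true} {
		svc, err := initIngestLimits(Config{Enabled: enabled})
		fmt.Println(enabled, svc != nil, err)
	}
}
```

With the check in one place, the constructor never hands back a partially initialized struct, which was the concern raised at the start of the thread.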

@periklis periklis merged commit 97c7347 into feat/usage-tracker Jan 10, 2025
60 checks passed
@periklis periklis deleted the ingest-limiter-service branch January 10, 2025 12:13
Labels
size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories