Q: filter results by label #698

Open
deniszh opened this issue Nov 23, 2024 · 6 comments


@deniszh

deniszh commented Nov 23, 2024

Hello,
I have a rather unorthodox setup and am wondering whether promxy could help with it.
I'm wondering why metrics_relabel_configs does not support other actions, such as "keep". Let me describe my use case in a bit more detail.
We have Prometheus instances that contain data for many clusters, so every metric has a "cluster" label applied. I want to be able to filter results by this label, with something like:

server_groups:
  - targets:
      - url1
    metrics_relabel_configs:
      - action: keep
        source_label: cluster
        regex: cluster1
  - targets:
      - url2
    metrics_relabel_configs:
      - action: keep
        source_label: cluster
        regex: cluster2

Then, results from url1 will contain only metrics with {cluster="cluster1"} and results from url2 will contain only metrics with {cluster="cluster2"}.

As far as I understand, this is not possible in the current implementation. Is it possible to implement in theory, and what would be needed? I'm interested in implementing it myself and just need a hint on where to start. I tried adding "relabel.Keep" to metric_relabel.go, but it does not work properly.

Thanks!

@deniszh
Author

deniszh commented Nov 25, 2024

Update:
I added the following to "ToRelabelConfig()":

	case relabel.Keep:
		cfg = &relabel.Config{
			Action:       c.Action,
			SourceLabels: model.LabelNames{c.SourceLabel},
			Regex:        regex,
		}

(with proper regex generation, .*sourcelabel.*), but with the configuration above, for the query

count(up) by (cluster)

I'm getting

{} | 53770
-- | --
{cluster="cluster1"} | 303

I.e. it's keeping empty metrics with no labels for some reason...

@deniszh
Author

deniszh commented Nov 28, 2024

OK, never mind, I implemented it in a fairly crude way: just removing the empty metrics afterwards. The code is at https://github.com/deniszh/promxy/tree/metric-filtering, if anyone is curious, but I understand that my use case is quite uncommon, so I'm not sure it should be ported upstream.

@deniszh deniszh closed this as completed Nov 28, 2024
@jacksontj
Owner

jacksontj commented Dec 7, 2024

Then, results from url1 will contain only metrics with {cluster="cluster1"} and results from url2 will contain only metrics with {cluster="cluster2"}.

If I'm understanding this correctly, it's actually pretty close. The issue here is that metrics_relabel_configs are a bit confusing. These are used to filter the targets coming from the servergroup discovery. Given that you are defining targets (presumably statically) this isn't actually necessary. So if you want to filter the labels going downstream you can use label_filter (https://github.com/jacksontj/promxy/blob/master/pkg/servergroup/config.go#L183-L207) -- likely the static_labels_include setting.

I believe that should cover your use-case; if not I might need a bit more explanation to understand it better :)

@deniszh
Author

deniszh commented Dec 10, 2024

Thanks, @jacksontj, but it doesn't. Let me add something here, maybe for some poor souls who need the same.

So, a bit more context first.
Assume you have the following setup:

Grafana DS A -> PrometheusA

Grafana DS B -> PrometheusB

So, two data sources in Grafana, for 2 separate instances of Prometheus, each serving data for some clusters and carrying the corresponding "cluster" label. (In reality it can be not 2 but 10, 20 or 100.)
Then you migrate to some other solution and decide to merge all the data into a single entity (Mimir / Thanos / a bigger Prometheus / whatever); let's call it PrometheusAB.
Then you can repoint all Grafana data sources to the new instance, like this:

Grafana DS A -> PrometheusAB

Grafana DS B -> PrometheusAB

But your users already expect that DS A contains only data from A and DS B contains only data from B, and they have already built 1000000 dashboards and alerts around that fact. :)

So that was exactly my problem. I tried label_filter with a static label, and it didn't work.
Initially I did an ugly implementation of keep relabeling in https://github.com/deniszh/promxy/tree/metric-filtering, and it works to some extent. For example, if PrometheusA contains data with "cluster=A" and PrometheusB contains data with "cluster=B", then my patch gives the proper result for

count(up) by (cluster)

-- | --
{cluster="clusterA"} | 303

But then I realized that some queries DO NOT CONTAIN the cluster label in their results, so my patch just cuts the whole result and returns nothing. What I actually need is to inject the "cluster=A" label matcher into all outgoing requests instead!

I was running out of time, so I just picked prom-label-proxy for the injection, but it doesn't support any auth, so I ended up with an ugly setup like this:

Grafana DS A -> PromxyA (insert proper header) -> prom-label-proxy (takes header and insert label to query) -> PromxyAuth (doing auth) -> PrometheusAB

Grafana DS B -> PromxyB (insert proper header) -> prom-label-proxy (takes header and insert label to query) -> PromxyAuth (doing auth) -> PrometheusAB

The real setup is a bit more complex, because I'm merging many Prometheus instances, and I share the auth promxy instances and the prom-label-proxy installation between many data sources.

So, in theory I could just add injectproxy support to promxy and greatly simplify this, but I was distracted by other stuff.

@jacksontj
Owner

Oh, so your explanation of PrometheusAB makes a LOT of sense.

I'm a bit unsure as to why the label_filter didn't work -- I would expect a label_filter with a static_labels_include should do the trick -- the key being that you'd need 2 promxy endpoints (as you want different configuration in each).

  1. metrics_relabel doesn't work -- since it just filters the endpoints from the discovery to pull (and in this case we want all of them)
  2. label_filter only filters the downstream -- but doesn't add the matcher of cluster=A
  3. labels would only add the label cluster=A to responses but not filter.

I'll start out by saying -- I haven't heard of anyone solving this problem yet; so we are in somewhat uncharted territory :)

So if my understanding above is correct what you need is more-or-less half of labels and label_filter -- you want to always add a label matcher to all downstream queries for a given promxy -- is that correct? If so I'm not sure what to call that -- it's basically something like static_matchers (meaning matchers to always be injected in queries to that servergroup).
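As a rough illustration of that idea (not an existing promxy feature), the sketch below enforces a fixed set of matchers on a selector before the query goes downstream. Everything here is an assumption made for the example: `Matcher`, `Selector` and `withStaticMatchers` are hypothetical names, the selector is a simplified struct rather than a real parsed PromQL node, and overwriting a client-supplied matcher on the same label is just one possible policy. A real implementation would need to parse the PromQL expression and apply this to every selector in it.

```go
package main

import (
	"fmt"
	"strings"
)

// Matcher is a simplified equality matcher such as cluster="A".
type Matcher struct {
	Name, Value string
}

// Selector is a simplified, hypothetical stand-in for a parsed vector selector.
type Selector struct {
	Metric   string
	Matchers []Matcher
}

// withStaticMatchers returns a copy of sel with every static matcher enforced.
// A client-supplied matcher on the same label name is replaced, so a query
// cannot reach outside the configured cluster.
func withStaticMatchers(sel Selector, static []Matcher) Selector {
	enforced := make(map[string]bool, len(static))
	for _, m := range static {
		enforced[m.Name] = true
	}
	out := Selector{Metric: sel.Metric}
	for _, m := range sel.Matchers {
		if !enforced[m.Name] {
			out.Matchers = append(out.Matchers, m)
		}
	}
	out.Matchers = append(out.Matchers, static...)
	return out
}

// String renders the selector in PromQL-like syntax.
func (s Selector) String() string {
	parts := make([]string, 0, len(s.Matchers))
	for _, m := range s.Matchers {
		parts = append(parts, fmt.Sprintf("%s=%q", m.Name, m.Value))
	}
	return s.Metric + "{" + strings.Join(parts, ",") + "}"
}

func main() {
	static := []Matcher{{Name: "cluster", Value: "A"}}

	fmt.Println(withStaticMatchers(Selector{Metric: "up"}, static))
	// prints: up{cluster="A"}

	q := Selector{Metric: "up", Matchers: []Matcher{
		{Name: "job", Value: "node"},
		{Name: "cluster", Value: "B"},
	}}
	fmt.Println(withStaticMatchers(q, static))
	// prints: up{job="node",cluster="A"}
}
```

Because the matcher is added to the query itself rather than applied to the results, an aggregation such as `count(up)` would only ever see series from the enforced cluster, even though its result carries no "cluster" label.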

@jacksontj jacksontj reopened this Jan 8, 2025
@deniszh
Author

deniszh commented Jan 8, 2025

Yes, you described that perfectly. The name could be e.g. inject_matchers, or what you proposed.
