-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set team on Alert Group based on route #3459
Comments
The current version of Grafana OnCall, at the time this issue was opened, is v1.3.64. If your issue pertains to an older version of Grafana OnCall, please be sure to list it in the PR description. Thank you 😄! |
We also like this feature! |
We really want this feature as well! |
We really need this feature. We cover 12 teams in one integration and the Alert Groups are all not assigned to a team. Only workaround sort of is assigning Labels to the alert groups. |
We are in the same boat, we manage single integration, but routing to multiple teams. Implement it please! |
Grafana team, it might be possible that I could get some work done for this ticket. There's two approaches (though not mutually exclusive, so both could be implemented if we wanted). Inherit the team from the escalation chain
Escalation step
I'd love some advice as to which you'd prefer implemented first or any other changes so that if we do get some time to work on this its ready to go. |
Hi! Thanks for filing the issue, the details and research around this. Changing the team assigned to an alert group may not be that simple. Right now there is no team stored along an alert group, but it is got from (and filtered via) the integration it was received. In that sense, it seems it should be possible to switch this to use escalation chain team instead (but as you noted that would mean a behavior change that could have unexpected results, besides not every alert group will have an escalation chain necessarily). A different alternative I can think of is to make it possible to associate multiple teams to an alert group, so you can have the team from the integration, but also inherit the team from the escalation chain (if different) and even add teams arbitrarily in the future (we would need to discuss this with the team). OTOH, and to understand the use case (and the possible solutions), is this mostly for tracking metrics / filtering alert groups? (right now that's what you get by setting up teams) what other scenarios you are trying to solve? Thanks! |
Besides using the escalation chain, another option to assign the team name to an Alert Group could be within
Besides reporting/filtering, there is also a permission piece to it. Every record (integration record and alert groups that have no team assigned are visible and accessible by everybody who has basic permissions. Restricting access to the users assigned specific team requires a team name for these records. I am actually also annoyed about integrations labeled as "No Team" as well. These integrations are accessible by ALL users with basic integration permissions. |
Filtering and tracking metrics is the key outcome we are looking for. Hadn't considered the permissions side of things as it doesn't apply to our use case but certainly seems like something that needs resolving as well. |
@matiasb I've been working through / thinking about implementation of this again and wondering a rough where to start. My thoughts based on your comments are as follows
* I believe the metrics exporter exports a single team property, so for multi team support a change here would need to be considered. Would it make sense to start working on point 1 while point 2 is being figured out? Btw I joined the community slack under |
Makes sense 👍 I would still define a M2M model between alert group and team, even if we restrict one team per alert group at first.
Right, having the model decided and defined we should be able to work on any of these (or variations) of this possible paths.
That's a good point, and this should probably be part of the work in point 1 (ie. changing the model will require us to update how we calculate the metrics; also the team filter for insights is now based on the team from the integration, so some extra work may be needed to keep things consistent).
Sounds good!
I see :-) feel free to ping me there to talk any details (I'm @matiasb there too) Thanks for pushing this forward! |
I am not familiar with your code nor OnCall internals, but adding support for multiple teams per Alert Group sounds to me |
Yeah, in any case I wouldn't enable multiple teams support at first but I would try to keep that in mind. About permissions, giving it a quick thought, I think as long as one of the teams allow you to access the alert group, you should be able to see it (right now there is no perms per team, only allow users outside the team to access things or not; per-user perms work at a global level too, ie. you cannot restrict access to an integration or schedule). |
Update on this so far:
Todo / things to consider:
|
One thing I noticed is that performing |
Nice!
Backfilling existing alert groups could be not that simple for setups where you have millions of alert groups, so not requiring a migration or making it online somehow could be better.
👍
Right now each team can decide if their resources are only visible to team members or anyone in the organization (via Settings -> Team and Access Settings). I think that if an alert group belongs to a team that allows anyone to get access, then it should be possible to check that alert group details (otherwise, you should be required to be member of any of the associated teams). What do you think? In any case, handling multiple teams can be left for a future iteration too. |
There are some changes in progress related to how the frontend bits are managed. I guess any issues around this should be workable later if needed. |
Thanks for the updates, this is really useful! Progress has been a bit slower than I hoped due to this being a lower priority task and few important things popping up. Progress report
BackfillingUnderstood - I've reworked my approach to fallback to integration when there's otherwise no value. This has made things easier as I don't need to write a migration for it 😅 Insights stuffSo I think I have the promethus / insights stuff mostly working. The implementation I have right now uses a tuple with integration id + team id
but then with many of the helper.py related functions we don't have an alertgroup to work with at the moment I'm doing For example:
I'm not sure if this is a performance concern I should worry about or not. I'm presuming most orgs have tens of teams with larger orgs many having hundreds? But I don't really have the intel to know. I could rework this to have another layer of cache which stores what teams the exporter needs to care about but I'm worried about the complexity that might bring. |
Nice! 👍
Curious, the idea is that the integration team is lost if another team is added? How complex does the alert group team filtering logic get handling both cases?
If an alert group has more than a team associated, it will be counted for each team, right?
In the above case, it may make sense to query all the teams in the org once, outside the loop and reuse that? Using some prefetch in the query may also help. But I could be missing context, do you have a WIP PR or branch to reference? Thanks! |
Very much WIP - #5320 - I'm not even sure if things like exporter / metric calls for updating teams make sense with this approach but haven't had time to get my head around yet. And I haven't reviewed any of these changes.
I've placed the check in the api serialiser so the filtering logic doesn't really change. The limitation I can see with this approach is that right now you can't configure it to "remove" or I guess unset the team. We could have this show both teams (the integration + the alert group) easily as well, but I didn't feel like there was much of a use case for that?
It should do! I haven't tested and confirmed it and I'm not very confident with my changes in this space yet. |
Just realised that I will need to add to the filtering logic in the view API. Will have a think about this |
I've updated the PR and put some comments around it. Would love some comments on how to move it to the next stage, or if anyone else would like to pick it up and run with it? It works locally but I haven't used it in anger yet nor at scale. I also can't run a lot of the automated tooling due to not having a Docker licence / ability to run tilt. |
What would you like to see!
An integration might be used by multiple teams and then routed to the correct team using the routes/channel filters. The team for the alert group however seems to be set by the integration team setting and there is no way to change the team during routing.
Product Area
Alert Flow & Configuration
Anything else to add?
Alternatively this might be something that could be added to an escalation step?
The text was updated successfully, but these errors were encountered: