I recently hit this during a cloud handover review. We had an environment with landscape-server and postgresql, both with grafana-agent subordinates related through the cos-agent relation.
It seems that grafana-agent does not support such deployments. In a 3-machine cluster, one machine had jobs for landscape-server while the others had jobs for postgresql; none had jobs for both.
This is likely because a single grafana-agent instance is run, with both subordinates using the same config file: /etc/grafana-agent.yaml
Unfortunately, I didn't see a clear way of getting this to work where both apps' jobs would be included in grafana-agent.yaml. We needed to fall back to using the nrpe charm and cos-proxy to address the alerts for one of the apps.
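To make the suspected failure mode concrete, here is a toy model in Python. It is not the charm's code: the function names, the `scrape_jobs` key, and the relation tuples are all hypothetical, and it only illustrates the guess above that two subordinates rendering the same file leave whichever one wrote last as the sole source of jobs.

```python
def render_overwrite(relations):
    """Toy model: each relation renders the whole file, so the last writer wins."""
    config = {"scrape_jobs": []}
    for _app, jobs in relations:
        # Hypothetical behavior: earlier content is replaced, not extended.
        config = {"scrape_jobs": list(jobs)}
    return config


def render_merged(relations):
    """Toy model of the desired outcome: jobs from every relation end up in one file."""
    merged = []
    for _app, jobs in relations:
        merged.extend(jobs)
    return {"scrape_jobs": merged}


# Hypothetical inputs; the job names are the ones quoted in this report.
relations = [
    ("postgresql", ["charmed-postgresql"]),
    ("landscape-server", ["landscape-server"]),
]
print(render_overwrite(relations))
print(render_merged(relations))
```

Reversing the order of `relations` flips which app survives in `render_overwrite`, which would match seeing landscape-server jobs on one machine and postgresql jobs on the others.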
To Reproduce
Not providing the bundle since it's for a customer, but it's pretty simple:
Deploy a 3 unit postgresql cluster. This was tested with the 14/stable channel, rev 468.
Deploy a 3 unit landscape-server cluster, onto the same machines as the above cluster. This was tested with the latest/stable channel, rev 121.
Deploy the grafana-agent charm. This was tested with the latest/stable channel, rev 223.
Relate the grafana-agent charm to both of the other charms.
Observe the rendered /etc/grafana-agent.yaml file on each of the 3 machines. None of the machines will have jobs for both of the apps; it's either one or the other.
Environment
This was tested on Juju 3.4.6 on Azure, although the cloud likely does not matter in this case.
Relevant log output
Just look at the /etc/grafana-agent.yaml file. You can grep for these patterns:
"job_name: charmed-postgresql"
"job_name: landscape-server"
With the reproducer in this ticket, only one of those patterns will match on any given machine, due to the race between the two grafana-agent subordinates running on the same machine.
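The same check can be scripted so it is easy to repeat on each machine. This is a minimal sketch that does a plain substring search, like grep, for the two job-name patterns quoted above; the helper names `jobs_present` and `check_file` are made up for illustration.

```python
from pathlib import Path

# Job-name patterns taken from this report.
PATTERNS = ("job_name: charmed-postgresql", "job_name: landscape-server")


def jobs_present(config_text, patterns=PATTERNS):
    """Map each pattern to True if it occurs anywhere in the config text."""
    return {pattern: pattern in config_text for pattern in patterns}


def check_file(path="/etc/grafana-agent.yaml"):
    """Read the rendered config and report which apps' jobs it contains."""
    return jobs_present(Path(path).read_text())
```

Calling `check_file()` on a machine affected by this bug should return `True` for exactly one of the two patterns; a machine with both apps' jobs rendered would return `True` for both.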
Additional context
postgresql and landscape-server were intentionally put on the same machines in order to reduce how many Azure VMs were needed for the project. Separating them would require additional VMs and thus likely additional cost. (If this were a MAAS/LXD cloud instead, splitting them apart into separate containers would be the obvious workaround.)
Vultaire changed the title from "grafana-agent doesn't support having units running on the same machine" to "grafana-agent doesn't support having multiple units running on the same machine" on Dec 6, 2024
lucabello transferred this issue from canonical/grafana-agent-k8s-operator on Jan 17, 2025
@lucabello Just as feedback: no. The environment deliberately had postgres and landscape-server colocated to keep the overall VM count down to reduce cost for a customer.
If this is a limitation that we have to live with for this charm, fine, but whatever replaces this should have this use case in mind - otherwise we'll simply be forced to continue to rely on the nrpe charm and cos-proxy.