Create a Tower template for adding/updating the CheckMK agent #5718

Open
2 of 8 tasks
acozine opened this issue Jan 9, 2025 · 3 comments


acozine commented Jan 9, 2025

User story

As an engineer, I want CheckMK to stay up-to-date with any changes (new services installed, replacement VMs, etc.).

Acceptance criteria

  • I can run the utils/checkmk_agent.yml playbook from a template in Tower and install the CheckMK agent on a single VM with --limit <vmname>
  • I can run the utils/checkmk_agent.yml playbook from a template in Tower, update all the monitored services on existing VMs, and ensure that the CheckMK agent is installed on all VMs in an environment with --limit <envname>

Concrete example

If I run the Tower template with --limit sandbox-xk2843.princeton.edu

Implementation notes, if any

  • create the template
  • install the checkmk.general collection on an EE (either update an existing EE or build a new, special-purpose one)
  • add a documentation line to the playbook noting that to run it locally, you need to install the checkmk.general collection
  • test the template on a new VM to confirm that the playbook installs the agent
  • test the template on a replacement VM to confirm that the playbook updates data collection
  • test the template on a VM with recent changes to confirm that the playbook updates monitored services
@acozine acozine added feature Operations pulls issues into the Operations ZenHub board labels Jan 9, 2025
@acozine acozine self-assigned this Jan 21, 2025

acozine commented Jan 22, 2025

The CheckMK collection was added to our standard/base EE by #4783. However, the EE was not rebuilt at the time. I have rebuilt it now, and we're on to the next problem.


acozine commented Jan 22, 2025

The playbook currently fails for QA hosts, both locally and on Tower, on the "Linux - Download Vanilla CRE agent" task with this error:

fatal: [lib-postgres-qa1.princeton.edu]: FAILED! => {"attempts": 3, "changed": false, "dest": "/tmp/check-mk-agent_2.2.0p22-vanilla.deb", "elapsed": 0, "msg": "Request failed: <urlopen error [Errno 111] Connection refused>", "url": "http://localhost:80/my_site/check_mk/agents/check-mk-agent_2.2.0p22-1_all.deb"}

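This failure is caused by an issue with variables: the URL in the error points at localhost:80 and a site called my_site, so the download request apparently never reaches our CheckMK server. The changes in #5389 should start to fix this. We also need to update vars for the new server name.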
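
As a rough illustration of how a variable problem could produce the URL in the error above, here is a minimal Python sketch. It is not the code of the checkmk.general collection. The function names, parameter names, and the idea that the URL is assembled from protocol, server, port, and site values are assumptions made for illustration, and `checkmk.example.org` is a made-up server name.

```python
from urllib.parse import urlsplit

# Hosts that suggest the server variable was never overridden (assumption).
PLACEHOLDER_HOSTS = {"localhost", "127.0.0.1"}


def agent_url(protocol, server, port, site, package):
    """Assemble an agent download URL from its parts (hypothetical helper)."""
    return f"{protocol}://{server}:{port}/{site}/check_mk/agents/{package}"


def looks_unconfigured(url):
    """Return True when the URL still points at a placeholder host."""
    return urlsplit(url).hostname in PLACEHOLDER_HOSTS


# Reproduces the URL from the error message when placeholder values are used.
failing = agent_url(
    "http", "localhost", 80, "my_site", "check-mk-agent_2.2.0p22-1_all.deb"
)

# A URL built from real values (made-up names here) would not be flagged.
fixed = agent_url(
    "https", "checkmk.example.org", 443, "prod", "check-mk-agent_2.2.0p22-1_all.deb"
)

print(failing, looks_unconfigured(failing))
print(fixed, looks_unconfigured(fixed))
```

A check like `looks_unconfigured` could run before the download step so that a missing variable fails with a clear message instead of "Connection refused".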


acozine commented Jan 24, 2025

Today's blocker (when running locally on Francis' laptop): when adding a new VM to CheckMK Enterprise edition, the VM cannot download the agent file until it has been manually added as a host in the UI. Adding the VM as a host adds it to the allow list for the default agent configuration. Ideally we would have a default "any VM we want" or "any VM in these folders" agent configuration, then add more restrictive configurations for specialized use cases. We don't understand the Agent Bakery well enough yet to know how to do that.
