Support bootc #830

Closed
wants to merge 15 commits into from

bshephar
Contributor

This PR adds a number of changes to roles in order to facilitate the use of image mode RHEL.

Contributor

openshift-ci bot commented Nov 27, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@@ -36,3 +36,4 @@ edpm_nova_compute_config_dir: /var/lib/config-data/ansible-generated/nova_libvir

# KSM control
edpm_kernel_enable_ksm: false
edpm_use_bootc: false
Contributor Author

We probably need a better way to implement this globally. But at least for testing purposes, this is what I've used to get something that deploys.

Contributor

This is why I added the edpm_bootc role in #813 so that we had a way to do it consistently across anywhere that needs it.

Contributor

Also, I created a new bootc branch for edpm-ansible: https://github.com/openstack-k8s-operators/edpm-ansible/tree/bootc

Can you propose this PR to the bootc branch instead?

I'll be reverting #813 from main until we are ready to merge all bootc support into main.

Contributor Author

Branch thing is done. But that role would need to be called from each and every playbook to detect and set the bootc variable, right? I guess we can just add it as an ansibleVar and avoid calling the role every time we start a new service.

Contributor Author

So, we can use a custom fact. For example:

cat /etc/ansible/facts.d/bootc.fact
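#!/usr/bin/env bash
# Custom Ansible fact: prints "true" on a bootc host, "false" otherwise.

is_bootc() {
  # -r strips the JSON quotes, so the value can be compared as a plain string.
  BOOTC_STATUS=$(sudo bootc status --json | jq -r .status.type)
  if [[ "$BOOTC_STATUS" == "bootcHost" ]]; then
     BOOTC_SYSTEM="true"
  else
     BOOTC_SYSTEM="false"
  fi
}

is_bootc

echo "${BOOTC_SYSTEM}"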

This is good because the user doesn't need to manually declare that they are using a bootc system, and it also works for our non-bootc systems:

[m3@osp-df-3 bootc]$ ansible -i inv.yaml all -m setup -a "filter=ansible_local"
edpm-compute-1 | SUCCESS => {
    "ansible_facts": {
        "ansible_local": {
            "bootc": true
        },
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false
}

So you could combine them in the same NodeSet if you wanted to. The downside of this approach is that we need to gather facts from each service. We have thus far tried to limit the amount of fact gathering required, so this approach may not be what we want without more granular control over which facts are gathered in each service. At the moment, we just define a variable for gather_facts; if that variable is true, we gather all facts. Obviously, that becomes necessary if we want to allow individual task executions that require facts, but when we only want local facts, gathering all of them adds non-trivial time to the execution of each service.

Offering it as a potential solution that we can debate. The alternative is that we require either bootc or non-bootc nodes in each NodeSet.
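
A minimal sketch of how the fact script above could make the non-bootc path explicit. It assumes facts are gathered as root (so sudo is dropped) and that jq is present on bootc hosts. The function names `bootc_fact_from_type` and `read_bootc_type` are hypothetical and not part of this PR.

```shell
#!/bin/sh
# Sketch of a facts.d script; function names are hypothetical.

# Map the .status.type value reported by `bootc status --json` to a JSON boolean.
bootc_fact_from_type() {
  if [ "$1" = "bootcHost" ]; then
    echo true
  else
    echo false
  fi
}

# Read the status type from the host. Any failure (no bootc binary, no jq,
# non-zero exit) leaves the output empty, which maps to "false".
# Assumption: the script runs as root, so no sudo is used here.
read_bootc_type() {
  if command -v bootc >/dev/null 2>&1 && command -v jq >/dev/null 2>&1; then
    bootc status --json 2>/dev/null | jq -r '.status.type // empty' 2>/dev/null
  fi
}

bootc_fact_from_type "$(read_bootc_type)"
```

Splitting the classification from the host query keeps the mapping easy to check on a machine that has no bootc installed.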

  - name: Push script
    ansible.builtin.copy:
-     dest: /usr/local/sbin/containers-tmpwatch
+     dest: /var/lib/openstack/cron/containers-tmpwatch
Contributor Author

/usr is immutable in bootc deployments, so I've proposed two ways of doing this. First, we bake the scripts into the container file:
https://github.com/openstack-k8s-operators/install_yamls/pull/950/files#diff-f8fb9af5355b45b9ca8936bf0d721c6f0e37e13b637f5598e2be19995dea23e7R45-R46

Second, we write them to /var/lib/openstack, which is the method used here. I personally prefer this approach if we can agree on a common place for any scripts that we want to use. It saves us from baking things into images and then trying to keep them in sync. Better, IMO, to have them in edpm-ansible for now.
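
To illustrate why the destination has to move, here is a small sketch that probes candidate directories and prints the first one that can actually be written to. The helper name `first_writable_dir` is hypothetical; the two paths in the usage comment are the old and new destinations from the diff above.

```shell
# Print the first candidate directory that exists and is writable.
# On a read-only mount the write check is expected to fail, because
# access(2) reports EROFS regardless of the caller's privileges.
first_writable_dir() {
  for dir in "$@"; do
    if [ -d "$dir" ] && [ -w "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Usage: prefer the traditional location, fall back to the path used in this PR.
# first_writable_dir /usr/local/sbin /var/lib/openstack/cron
```

This is only a diagnostic aid. Agreeing on a single location such as /var/lib/openstack avoids the need for any probing at deploy time.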

Comment on lines 36 to 47
ansible.builtin.include_role:
  name: osp.edpm.edpm_container_standalone
vars:
  edpm_container_standalone_service: ovn_controller
  edpm_container_standalone_container_defs:
    ovn_controller: "{{ lookup('template', 'ovn_controller.yaml.j2') | from_yaml }}"
  edpm_container_standalone_kolla_config_files:
    ovn_controller: "{{ lookup('template', 'kolla_ovn_controller.yaml.j2') | from_yaml }}"
Contributor Author

All of this needs to stay in order to support both deployment methodologies. It can just be conditional like:
https://github.com/openstack-k8s-operators/edpm-ansible/pull/830/files#diff-34e3323585e197e806d463771e3b5132716048c41818b1318fecb2c0d8e36cd6R45

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3d020a0278384b7f97de3e8e26403819

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 53m 23s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 47s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 38m 46s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 18s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 7m 03s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 11s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 40s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 52s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 7m 58s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 8m 37s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 41s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 14s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 35s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 18s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 09s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 6m 04s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 04s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 04s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 41m 16s

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/87ae0cbb50854f54a18570dc772271c1

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 50m 57s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 33s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 46m 20s
edpm-ansible-tempest-multinode POST_FAILURE in 1h 42m 59s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 6m 01s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 20s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 44s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 10m 16s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 10m 10s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 11m 10s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 56s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 32s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 57s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 38s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 21s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 6m 10s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 11s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 35s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 38m 42s

bshephar force-pushed the support-bootc branch 12 times, most recently from ffd86e6 to 54f7101 on December 2, 2024 01:52

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/deff54e047964013a0c7461f18cfe415

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 58m 14s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 46s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 37m 26s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 44s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 5m 44s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 07s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 35s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 55s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 9m 21s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 26s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 40s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 16s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 24s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 31s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 12s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 55s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 7m 34s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 06s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 46m 37s

{{
ovn_controller_pod_spec | combine({
'spec': {
'containers': ovn_controller_pod_spec.spec.containers | zip_longest([], [{'image': edpm_ovn_controller_agent_image}]) | map('combine') | list,
Contributor

@slagle slagle Dec 6, 2024

I believe that if we were to customize the image like this then by definition the container is no longer "logically bound". Instead, it would be considered a "floating" container per https://containers.github.io/bootc/logically-bound-images.html#comparison-with-default-podman-systemd-units

and also:

There is no mechanism to inject arbitrary arguments to the podman pull (or equivalent) invocation used by bootc.

which seems to imply that additional mounts or other options passed to podman pull are not possible.

The dynamically-injected ConfigMaps[1][2] may provide some customization, but that is still not likely for the app container image itself, b/c once that is changed to some other image, then that no longer fits into how logically bound images should be managed with the lifecycle of the base bootc image itself.

Point being...if we choose to allow the ability to podman run any arbitrary image at runtime, then these really aren't logically bound images at all, but are considered "floating".

The question becomes: should we adopt logically bound images and require our end users to build new bootc images before deploying EDPM nodes whenever they need to customize any of the container images? We could ship a bootc image that has all the images logically bound, but if a user wanted to run a different one (from a partner, etc.), then they would need to rebuild that image.

I do like the quadlet/systemd design, and I think we can still adopt that either way.

[1] https://containers.github.io/bootc/building/guidance.html?highlight=configmap#configuration
[2] containers/bootc#22
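
For context, here is a sketch of how an image becomes "logically bound" at build time, as I understand the bootc documentation linked above: a Quadlet unit is symlinked into /usr/lib/bootc/bound-images.d. The helper name `bind_quadlet_image` and the unit name `ovn_controller.container` are hypothetical, and the Quadlet directory /usr/share/containers/systemd is an assumption taken from the upstream example.

```shell
# Declare a Quadlet unit as a logically bound image by linking it into
# bound-images.d. $1 is the image build root (empty inside a Containerfile RUN),
# $2 is the unit file name under usr/share/containers/systemd.
bind_quadlet_image() {
  root=$1
  unit=$2
  mkdir -p "$root/usr/lib/bootc/bound-images.d"
  # Relative target, so the link resolves no matter where the root is mounted.
  ln -s "../../../share/containers/systemd/$unit" \
    "$root/usr/lib/bootc/bound-images.d/$unit"
}

# Usage inside an image build (hypothetical unit name):
# bind_quadlet_image "" ovn_controller.container
```

Because the link is created when the bootc image is built, swapping the container image afterwards means rebuilding the bootc image, which is the trade-off raised in the comment above.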

bshephar force-pushed the support-bootc branch 2 times, most recently from a0569c4 to b590229 on December 9, 2024 05:38

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/e201c6428c384df8a92084c2694ba93e

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 55m 48s
podified-multinode-edpm-deployment-crc FAILURE in 1h 41m 35s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 43m 16s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 37s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 6m 54s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 14s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 49s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 7m 01s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 9m 38s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 45s
edpm-ansible-molecule-edpm_frr FAILURE in 7m 02s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 35s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 50s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 16s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 06s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 59s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 7m 41s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 14s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 39m 26s

Contributor

slagle commented Dec 10, 2024

I created a new bootc branch for edpm-ansible: https://github.com/openstack-k8s-operators/edpm-ansible/tree/bootc

Can you propose this PR to the bootc branch instead?

I reverted #813 from main in #844. I think that was the only other bootc-related PR that has merged.

bshephar force-pushed the support-bootc branch 4 times, most recently from 1afffaf to 51a8b34 on December 11, 2024 00:34
bshephar changed the base branch from main to bootc on December 11, 2024 01:34

- name: Import packages tasks
  ansible.builtin.import_tasks: packages.yml
  when: not ansible_local.bootc
Contributor

You need to update this change based on the commit I made earlier to handle bootc. packages.yml is already included earlier, at line 24, using the other variable I had used, "bootc". That needs to be undone so we can go forward with what you're proposing here.

Contributor

Also, should we switch this to include_tasks so that the when skips all the tasks at once instead of being evaluated for each task individually?

Contributor

I also see that the only reason ansible_local.bootc is set here is because of the task in bootstrap_command.yml to read local facts. I think we need that to be more explicit. Probably add something directly in bootstrap.yml.

I believe this answers my earlier comment on playbooks/bootstrap.yml on how the fact is initially set. We should make it more explicit.

@@ -42,6 +42,7 @@
       name: osp.edpm.edpm_kernel
     tags:
       - edpm_kernel
+    when: not ansible_local.bootc
Contributor

How is the fact ansible_local.bootc initially gathered given that gather_facts defaults to false in the playbook?

Contributor Author

We need to figure out what we want this to look like still. I have just set gather_facts: true in my deployment Ansible vars.

Maybe we would need to change the default for it to gather_subset: local at a minimum. It's not ideal that we would gather all facts for every service, but at the moment, that's what my gather_facts: true is doing until I come up with something better.
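
As a rough illustration of the narrower gathering mentioned above, the setup module can be limited to the facts.d ("local") subset by excluding `all` and `min`. The wrapper name `gather_local_facts` and the `ANSIBLE_BIN` override are hypothetical; the inventory name in the usage comment is the one from the earlier ad-hoc command.

```shell
# Gather only the facts.d ("local") subset from every host in an inventory.
# ANSIBLE_BIN is a hypothetical override, handy for a dry run (ANSIBLE_BIN=echo).
gather_local_facts() {
  "${ANSIBLE_BIN:-ansible}" -i "$1" all -m setup \
    -a 'gather_subset=!all,!min,local filter=ansible_local'
}

# Usage:
# gather_local_facts inv.yaml
```

The single quotes keep an interactive shell from treating `!` as history expansion. Whether the equivalent play-level setting fits how each service sets gather_facts is still the open question here.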

Contributor

We will also need to guard the "Download packages" task in roles/edpm_download_cache/tasks/main.yml with the fact.

To work around this, I dropped download-cache from my NodeSet services.

roles/edpm_container_manage/tasks/shutdown.yml (outdated)
Contributor

Seems like the "Insert cronjob in root crontab" task requires the cronie RPM. We might need to add that to the image build.

bshephar force-pushed the support-bootc branch 4 times, most recently from 29ca81f to 8d92a63 on January 20, 2025 01:05
This change moves the script we're using for the
logs cronjob into the /var/lib/openstack/cron directory. This facilitates
the bootc immutable filesystem where we can't write to /usr, while also
consolidating scripts relevant to our deployment in a common place.

Signed-off-by: Brendan Shephard <[email protected]>
bshephar force-pushed the support-bootc branch 3 times, most recently from a0efece to 94f062b on January 20, 2025 01:43
bshephar force-pushed the support-bootc branch 2 times, most recently from 0260435 to 0d05701 on January 20, 2025 04:06
ansible.builtin.systemd_service:
  name: edpm-compute@logrotate_crond
  enabled: true
  state: started
Contributor

I think we need to separate out this initial bootc support from the logically bound containers PR, openstack-k8s-operators/edpm-image-builder#39

This PR has a strong dep on the logically bound PR, and that complicates things. Let's just get a base bootc working with how we manage containers presently. We can move to logically bound and all the quadlet/systemd stuff as a next step.

Contributor Author

Yeah, ok. Let's decouple them. I'll just submit a new PR to edpm-image-builder to change the Quadlet files over to using .container instead of .kube. Then a new one here to work with those instead of the .kube files.

@slagle slagle mentioned this pull request Jan 21, 2025
@bshephar
Contributor Author

So, we merged a subset of this in the interest of slimming down the number of required changes. Let's backlog this particular piece of work in favor of getting something demo-worthy. We'll collect feedback, and then we can loop back to the logically bound images work if we feel it's still worth pursuing.

@openshift-merge-robot
Contributor

PR needs rebase.


@bshephar bshephar closed this Jan 22, 2025