
enable cilium CNI option #72

Merged

Conversation

@carpenike (Member, Author) commented Oct 5, 2020

Description

This is a bit opinionated; feel free to modify as you see fit or ignore altogether. :)

This enables choosing the CNI used within the cluster, along with an option to disable kube-proxy. Kube-proxy remains enabled by default, and Cilium is not the default CNI.

(Opinionated part): This also installs kube-router for the case where Cilium is used and BGP peering with an external service is required.

Type Of Change

To help us figure out who should review this PR, please put an X in all the areas that this PR affects.

  • Docs
  • Installation
  • Performance and Scalability
  • Security
  • Test and/or Release
  • User Experience

Issue Ref (Optional)

Notes

Partially addresses #68

Signed-off-by: Ryan Holt <[email protected]>
bgp_peer_asn: 64512
cilium_helm_version: 1.8.3
cilium_image_version: v1.8.3
k8s_service_host: "{{ hostvars[groups['masters'][0]]['ansible_host'] }}"
carpenike (Member, Author):

@RobReus -- This really should be the VIP if one is present; I'm at the edge of my Ansible skills.

Member:

When using keepalived this is associated with keepalived_vip, but I don't know how running cilium affects running keepalived.

carpenike (Member, Author):

Theoretically it should be outside of Cilium, as it's a host-level config, right?

However, this would need to be an either/or, I think; if a VIP is present use that, otherwise use the master IP?

Member:

Yeah, I think you're right; it should be an either/or situation. I'm not sure exactly how, but my assumption is that it would be possible with some Jinja operators in the template, plus some sanity checking for each var.
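
A minimal sketch of that either/or in Jinja, assuming `keepalived_vip` is only defined when keepalived is in use (the `default` filter supplies the fallback):

```yaml
# Sketch, not the PR's final form: prefer the keepalived VIP when it is
# defined, otherwise fall back to the first master's address.
k8s_service_host: "{{ keepalived_vip | default(hostvars[groups['masters'][0]]['ansible_host']) }}"
```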

carpenike (Member, Author):

@xunholy not sure why these are failing on files I didn't touch.

@xunholy (Member) left a comment:

We can ignore the errors that didn't originate from your changes, but if we can address the ones we introduced that would be good!!

@@ -13,3 +13,4 @@ cluster_pod_subnet: ""
# Default etcd values, change these if you experience "leader changed" issues when running on a SD card
cluster_etcd_heartbeat_interval: 100
cluster_etcd_election_timeout: 1000
kube_proxy: enabled
Member:

Would be good to have a comment in here so users roughly know why you'd have this disabled.

Member:

I'm unfamiliar with the requirements for Cilium, so I'm under the assumption that this is a requirement for it. If that is the case, then this var should be scoped as such; explicitly setting this var for later use could be misleading to some users. Having this var set based on which CNI you're hoping to deploy would be the ideal situation, likely via a conditional vars import based on the user's CNI selection.

carpenike (Member, Author):

It's not a requirement, but it would modify the installation options for cilium if you're not using kube-proxy.

Member:

I would still prefer that the variable be scoped, or at the very least some comments be added to detail what you're enabling/disabling.
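
A hedged sketch of that scoping; the `cluster_cni` variable name and per-CNI vars files are assumptions for illustration, not the PR's final layout:

```yaml
# Sketch: pull kube-proxy (and other CNI-coupled) defaults from a vars file
# chosen by the user's CNI selection, rather than a standalone global var.
- name: Load CNI-specific variables
  ansible.builtin.include_vars:
    file: "{{ cluster_cni }}.yml"   # e.g. cilium.yml could set kube_proxy: disabled
```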

ansible/roles/cluster/tasks/initialize.yml (resolved)
Comment on lines 27 to 29
--set global.hubble.enabled="true" \
--set global.hubble.ui.enabled="true" \
--set global.hubble.relay.enabled="true" \
Member:

I wonder if there is an easier way to store these values so they're easier for users to configure; for example, a user on arm might not be able to enable Hubble right now. Would it make sense to store these in a values.yaml under files/, copy that over, and just pass --values=values.yaml on the host? This is just a thought.

carpenike (Member, Author):

Yeah, we could template out a cilium-values.yaml file.

Member:

Yep, if we can template out a file that users can use to edit helm values that would be perfect!
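
A minimal sketch of that approach, assuming a `cilium-values.yaml.j2` template in the role (file names and paths here are illustrative):

```yaml
# Sketch: render user-editable helm values from a template, then point
# helm at the rendered file instead of a long list of --set flags.
- name: Template out Cilium helm values
  ansible.builtin.template:
    src: cilium-values.yaml.j2
    dest: /tmp/cilium-values.yaml

- name: Install Cilium chart with the templated values
  ansible.builtin.command: >
    helm upgrade --install cilium cilium/cilium
    --namespace kube-system
    --version {{ cilium_helm_version }}
    --values /tmp/cilium-values.yaml
```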

@xunholy requested review from RobReus and crutonjohn on October 6, 2020
@crutonjohn (Member) left a comment:

This is looking great. Glad to be able to support more use cases and implementations! Thanks @carpenike.

ansible/roles/cni/tasks/cilium.yml (outdated, resolved)

@crutonjohn (Member) left a comment:

Looks mostly good. Just need to resolve the one issue with the k8s_service_host var for cilium.

@xunholy force-pushed the add-cilium-cni-with-kube-proxy-option branch from e6ed4c0 to 645377e on November 29, 2020
@xunholy requested review from RobReus and removed the request for RobReus on December 6, 2020
@xunholy changed the base branch from master to main on December 6, 2020
@anthr76 (Member) commented Dec 18, 2020:

We should probably use the generic package module to see if helm is already present.

Considering that for kubeadm manifest generation we use cluster_control_plane_endpoint, shouldn't we use that over the IP of the first master?

controlPlaneEndpoint: {{ cluster_control_plane_endpoint }}

For:

k8s_service_host

https://github.com/carpenike/k8s-cluster-installation/blob/c3c2e1af8004bb91630b687dc0fae76c54f36e89/ansible/roles/cni/templates/values.yaml.j2#L101

Otherwise this looks pretty good. Going to do some testing later.
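
A sketch of that helm check with the generic package module, assuming a `helm` package exists in the target's package manager:

```yaml
# Sketch: let the OS package manager ensure helm is present instead of
# downloading it manually; this is a no-op if it is already installed.
- name: Ensure helm is installed
  ansible.builtin.package:
    name: helm
    state: present
```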

anthr76 added a commit to anthr76/infra that referenced this pull request Dec 18, 2020
My modifications to the pending PR are yet to be pushed
(raspbernetes/k8s-cluster-installation#72).

* Ansible (like always) needs to be cleaned up
* Cilium + Kuberouter
* Etcd/Controlplane is now kubeadm managed
* Add new certs/creds

Signed-off-by: anthr76 <[email protected]>
Opt to use newer official image rather than 3rd-party source.

Signed-off-by: anthr76 <[email protected]>
Also update the inventory source and delegate to localhost, also for now.

Change galaxy collection to new name.

https://github.com/ansible-collections/community.kubernetes/issues/221

Signed-off-by: anthr76 <[email protected]>
This is required to leverage the Kubernetes modules from Ansible. Hopefully this changes in the future.

According to GitHub issues a .deb is planned, and an RPM is available. There is a package in the AUR, though I opted to install with pip instead, as I'd trust that more than an untested AUR package.

Signed-off-by: anthr76 <[email protected]>
Set in a more elegant multi-os manner later.

Signed-off-by: anthr76 <[email protected]>
@carpenike (Member, Author) commented:

> instead of using helm on the node itself I'm opting for helm to be a requirement on the host itself. I would enjoy some input from others on this topic.

I think this would also require you to pull the kubeconfig off of the master and put it on your local host. Part of doing it on the master node was to follow the pattern in place for other one-off kube-API-related changes/settings.

@anthr76 (Member) commented Jan 25, 2021:

Yeah, considering that the rest of the roles are configured to interact with the Kubernetes API on the main master, that's what I ended up doing. It's now making the helm changes on the first master node.

Though you could just grab the kubeconfig output from cluster creation, I think what you suggested is the best bet. It's really a matter of where you want to add the dependencies. I appreciate the input!
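
Roughly, that pattern looks like the sketch below: helm runs once, delegated to the first master, where the admin kubeconfig already lives (paths and names are illustrative):

```yaml
# Sketch: run helm on the first master so no kubeconfig has to be copied
# to the Ansible control host.
- name: Apply the Cilium release from the first master
  ansible.builtin.command: >
    helm upgrade --install cilium cilium/cilium --namespace kube-system
  environment:
    KUBECONFIG: /etc/kubernetes/admin.conf
  delegate_to: "{{ groups['masters'][0] }}"
  run_once: true
```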

@anthr76 force-pushed the add-cilium-cni-with-kube-proxy-option branch from 902e8fa to a8ebde2 on January 25, 2021
@anthr76 removed their assignment on Jan 25, 2021
This was dropped due to the latest merge and made it into the working tree (raspbernetes#84).

Signed-off-by: anthr76 <[email protected]>
Disable kube-proxy by default for cilium. Revise var handling.

Signed-off-by: anthr76 <[email protected]>
Signed-off-by: anthr76 <[email protected]>
* Revise requirements to correctly point to modules.
* Disable hubble by default. Users can, and probably should, do this on their [own](https://docs.cilium.io/en/v1.9/gettingstarted/k8s-install-kubeadm/#enable-hubble-for-cluster-wide-visibility); in the meantime we'll not support it but leave the option there.
* Add sysctls for a current systemd bug (cilium/cilium#10645)

Signed-off-by: anthr76 <[email protected]>
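
For context, the sysctl workaround for cilium/cilium#10645 (systemd defaulting rp_filter to 1, which breaks Cilium's datapath) looks roughly like this; the exact key and file name here are assumptions:

```yaml
# Sketch: override systemd's rp_filter=1 default so traffic on Cilium's
# veth interfaces is not dropped by reverse-path filtering.
- name: Relax reverse path filtering for Cilium
  ansible.posix.sysctl:
    name: net.ipv4.conf.all.rp_filter
    value: "0"
    sysctl_file: /etc/sysctl.d/99-override-rp-filter.conf
    state: present
```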
@anthr76 (Member) commented Jan 27, 2021:

I think after review this PR can be considered ready to be merged.

# cluster_etcd_heartbeat_interval: 100
# cluster_etcd_election_timeout: 1000
# cluster_kube_proxy_enabled: true
Member:

Added this value twice

@@ -0,0 +1,3 @@
collections:
Member:

Can we prefix this file with `---`?

* Added var twice
* `---`

Signed-off-by: anthr76 <[email protected]>
@xunholy merged commit d162e72 into raspbernetes:main on Jan 30, 2021