
[envTest] Feature request: Enable kube-controller-manager #3083

Open
migueleliasweb opened this issue Jan 21, 2025 · 11 comments

Comments

@migueleliasweb

Hi all,

The fact that kube-controller-manager is missing from envTest means that some of the most core behaviours in Kubernetes are basically impossible to test. Things like:

  • Namespace termination
  • Garbage collection of owned resources
  • Core finalizers (will get stuck)

Here are some of the previously opened issues related to the missing kube-controller-manager in envTest:

Personally, I'm really keen on having garbage collection working. Without it, one would need to add some not-so-nice hacks to envtest to pretend things are working.

This forces users to perform E2E tests on a "real cluster" (something like kind), but then the main issue becomes the substantially longer turnaround time and the inability to easily debug tests with breakpoints (although still possible, it's a pain to set up).

If this capability is intentionally disabled and is not planned to be enabled, I'd like that to be documented so users won't keep asking for it in similar issues. Other than that, if this could be worked out, it would be AMAZING.

Thanks in advance!

@migueleliasweb
Author

migueleliasweb commented Jan 21, 2025

I understand this can become a slippery slope, but if a conscious decision were made that no "runtime-bound" controllers are enabled, like the deployment-controller (we don't really want pods running for real, you know), and only satellite controllers like the namespace-controller and garbage-collector-controller are, that would already help heaps!

Another possibility is to document the use of UseExistingCluster. Maybe this is also a viable alternative for getting access to more fully fledged clusters for testing. This option is not very well documented, and it would be great to get more info about it too.

EnvTest "UseExistingCluster": https://github.com/kubernetes-sigs/controller-runtime/blob/v0.20.0/pkg/envtest/server.go#L157
Reference: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/

@sbueringer
Member

I think this would be a nice thing to have. I'm not sure how hard it is to implement (or if there are any showstoppers).

In general, it should be possible to use "UseExistingCluster" with kind.

@alvaroaleman
Member

My take here is that if you want a functional cluster, you should be using kind. Maybe we can try to make it easy to use kind from envtest?

I am not a fan of teaching envtest how to set up a controller-manager (and soon after a scheduler, because that will be the next ask), because then we have to deal with keeping its config up to date with upstream changes and generally functioning. kind already does all of that, and likely better than we would.

@sbueringer
Member

Technically you could call kind as a library to create a kind cluster. We do this in CAPI. But I don't want a Go dependency on kind in the CR go module.

@migueleliasweb
Author

Technically you could call kind as a library to create a kind cluster. We do this in CAPI. But I don't want a Go dependency on kind in the CR go module.

I was browsing yesterday and found that the E2E Framework worked around this by wrapping kind in a subcommand. See: https://github.com/kubernetes-sigs/e2e-framework/blob/main/third_party/kind/kind.go.

That got me thinking that the E2E framework might be more suitable for this kind of setup, but I would still miss the simplicity of envTest: not having to build a container, load the container, run the container for real, install cert-manager, prometheus, etc., etc., just to run 10 tests that finish in 20s 🙄 while the whole bootstrapping process takes 5m.

I'd be happy if some more examples and docs could be added to provide a clearer path to using the UseExistingCluster option with KinD. I was digging through the code yesterday, and it seems like, as long as the call to client.GetConfig() (from CR) works, envTest is happy to pretend the cluster is up and skip all of the bootstrapping logic. Is that somewhat correct?

Providing an example in the docs of how one could use kind programmatically would also be very beneficial.
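
For what it's worth, my rough mental model (just a sketch, I haven't verified this against the envtest source) is that when the kubeconfig already points at the kind cluster, something like this should be enough:

testEnv := &envtest.Environment{
	// With UseExistingCluster set, envTest should resolve the kubeconfig itself
	// (e.g. via KUBECONFIG) instead of starting kube-apiserver and etcd locally.
	UseExistingCluster: ptr.To(true),
}

cfg, err := testEnv.Start()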

@migueleliasweb
Author

migueleliasweb commented Jan 22, 2025

I've got a snippet showing how to control kind via Golang that can help set up envTest with UseExistingCluster:

package utils

import (
	"errors"
	"os"

	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	kindapidefaults "sigs.k8s.io/kind/pkg/apis/config/defaults"
	kindapiv1alpha4 "sigs.k8s.io/kind/pkg/apis/config/v1alpha4"
	kindcluster "sigs.k8s.io/kind/pkg/cluster"
	kindcmd "sigs.k8s.io/kind/pkg/cmd"
	"sigs.k8s.io/yaml"
)

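// kindDockerProvider drives kind against the local Docker daemon,
// logging through kind's standard CLI logger.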
var kindDockerProvider = kindcluster.NewProvider(
	kindcluster.ProviderWithDocker(),
	kindcluster.ProviderWithLogger(
		kindcmd.NewLogger(),
	),
)

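// SetupKindCluster creates a kind cluster with the given name (one control-plane
// node and two workers) if a cluster with that name does not already exist.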
func SetupKindCluster(
	clusterName string,
) error {
	activeClusters, err := kindDockerProvider.List()

	if err != nil {
		return err
	}

	for _, activeClusterName := range activeClusters {
		if activeClusterName == clusterName {
			// nothing to do
			return nil
		}
	}

	clusterConfig := &kindapiv1alpha4.Cluster{
		TypeMeta: kindapiv1alpha4.TypeMeta{
			APIVersion: "kind.x-k8s.io/v1alpha4",
			Kind:       "Cluster",
		},
		Nodes: []kindapiv1alpha4.Node{
			{
				Role:  kindapiv1alpha4.ControlPlaneRole,
				Image: kindapidefaults.Image,
			},
			{
				Role:  kindapiv1alpha4.WorkerRole,
				Image: kindapidefaults.Image,
			},
			{
				Role:  kindapiv1alpha4.WorkerRole,
				Image: kindapidefaults.Image,
			},
		},
	}

	kindapiv1alpha4.SetDefaultsCluster(clusterConfig)

	yamlBytes, err := yaml.Marshal(clusterConfig)

	if err != nil {
		return err
	}

	tmpFile, err := os.CreateTemp(
		os.TempDir(),
		"kind-config-*.yaml",
	)

	if err != nil {
		return err
	}

	bytesWritten, err := tmpFile.Write(yamlBytes)

	// Check the write error before the byte count: a short write also surfaces an error.
	if err != nil {
		return err
	}

	if bytesWritten != len(yamlBytes) {
		return errors.New("failed to write the expected number of bytes to kind cluster config")
	}

	if err := tmpFile.Sync(); err != nil {
		return err
	}

	// Close the temp file before handing its path to kind.
	if err := tmpFile.Close(); err != nil {
		return err
	}

	if err := kindDockerProvider.Create(
		clusterName,
		kindcluster.CreateWithConfigFile(tmpFile.Name()),
	); err != nil {
		return err
	}

	return nil
}

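// GetRestConfigForKindCluster returns a *rest.Config for the named kind cluster,
// built from the kubeconfig that kind generates for it.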
func GetRestConfigForKindCluster(
	clusterName string,
) (*rest.Config, error) {
	kubeConfigStr, err := kindDockerProvider.KubeConfig(
		clusterName,
		false,
	)

	if err != nil {
		return nil, err
	}

	return clientcmd.RESTConfigFromKubeConfig(
		[]byte(kubeConfigStr),
	)
}

You can feed this to:

testEnv := &envtest.Environment{
	Config:             restConfig,
	UseExistingCluster: ptr.To(true),
}
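
Putting the two pieces together in a suite setup would look roughly like this (just a sketch, assuming the utils package above; the cluster name and CRD path are placeholders for whatever your project uses):

clusterName := "envtest-kind" // arbitrary example name

if err := utils.SetupKindCluster(clusterName); err != nil {
	log.Fatal(err)
}

restConfig, err := utils.GetRestConfigForKindCluster(clusterName)
if err != nil {
	log.Fatal(err)
}

testEnv := &envtest.Environment{
	Config:             restConfig,
	UseExistingCluster: ptr.To(true),
	// CRDs from disk still get installed into the existing cluster by Start().
	CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
}

// Start() skips the local kube-apiserver/etcd bootstrap when UseExistingCluster is set.
cfg, err := testEnv.Start()
if err != nil {
	log.Fatal(err)
}

(plus the usual testEnv.Stop() in the suite teardown)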

@migueleliasweb
Author

migueleliasweb commented Jan 22, 2025

I'm still having some issues with Webhooks now refusing connections, thus causing errors in all of my tests.

 dial tcp 127.0.0.1:33943: connect: connection refused

I've checked the ports and they are the correct ones, dynamically configured by envTest itself. I have a feeling this has something to do with the certs used to configure the Webhooks. Since there is now a real cluster with a real API Server running, there might be some extra bits that need to be configured.

What I've got working previously with envTest, without using UseExistingCluster:

mgr, err := ctrl.NewManager(envTestReturn.RestConfig, ctrl.Options{
	WebhookServer: webhook.NewServer(webhook.Options{
		Host: envTest.WebhookInstallOptions.LocalServingHost,
		Port: envTest.WebhookInstallOptions.LocalServingPort,
		CertDir: envTest.WebhookInstallOptions.LocalServingCertDir,
	}),
})

I'm not sure if anything different needs to be done when using an external cluster.

After removing the registration of the webhooks, all tests pass again.

Webhooks can be made to work if you override LocalServingHost in your WebhookInstallOptions to point to the address of the gateway used by the docker network that kind created. In my case, 172.19.0.1.
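
For reference, a minimal sketch of that override (the gateway address and webhook manifest path below are just what my setup uses, adjust for yours):

testEnv := &envtest.Environment{
	Config:             restConfig,
	UseExistingCluster: ptr.To(true),
	WebhookInstallOptions: envtest.WebhookInstallOptions{
		Paths: []string{filepath.Join("..", "config", "webhook")},
		// This has to be reachable from inside the kind nodes, so 127.0.0.1 won't work;
		// 172.19.0.1 is the gateway of the docker network that kind created in my case.
		LocalServingHost: "172.19.0.1",
	},
}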

@sbueringer
Member

@migueleliasweb I think we made this work in Cluster API (not sure if I remember correctly, it was a while ago).

Can you please check if there is something useful here? https://github.com/kubernetes-sigs/cluster-api/blob/f9cd33fa58926b73cb31beb335c75a41c80e4181/internal/test/envtest/environment.go#L279-L286

@migueleliasweb
Author

migueleliasweb commented Jan 22, 2025

@migueleliasweb I think we made this work in Cluster API (not sure if I remember correctly, it was a while ago).

Can you please check if there is something useful here? https://github.com/kubernetes-sigs/cluster-api/blob/f9cd33fa58926b73cb31beb335c75a41c80e4181/internal/test/envtest/environment.go#L279-L286

In my case, I was forced to use my Docker gateway address so that kind, which runs on its own docker network, can reach back to my controller-runtime manager, which runs in a separate container inside VSCode's devcontainer (with network=host). 😂 It's a bit all over the place. I tried 0.0.0.0 but couldn't get it to work.

@troy0820
Member

My question around this is: which node image would kind use for this? If that were embedded with envtest, it seems like it would cause a lot of grief for users dealing with older clusters who want the goodness of what envtest offers. I am with @sbueringer and @alvaroaleman on this part:

My take here is that if you want a functional cluster, you should be using kind. Maybe we can try to make it easy to use kind from envtest?

I am not a fan of teaching envtest how to set up a controller-manager (and soon after a scheduler, because that will be the next ask), because then we have to deal with keeping its config up to date with upstream changes and generally functioning. kind already does all of that, and likely better than we would.

Technically you could call kind as a library to create a kind cluster. We do this in CAPI. But I don't want a Go dependency on kind in the CR go module.

and I also agree that using an existing cluster may be sufficient to cover the needs around the k8s components that kind already offers.

@sbueringer
Member

sbueringer commented Jan 23, 2025

I think we wouldn't embed the kind node image in envtest.

If we were to add kube-controller-manager to envtest, we would just include the kube-controller-manager binary like we include the kube-apiserver binary today (see the release attachments here: https://github.com/kubernetes-sigs/controller-tools/releases/tag/envtest-v1.32.0).
