NodePool infinitely reconciles #435

Open
aaronschweig opened this issue May 14, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@aaronschweig

What happened?

Hello!

I have been introducing Crossplane at my company over the last few weeks. It works great, but I noticed an issue with the NodePool CRD.

I created a Cluster resource and a NodePool resource simultaneously. Creation took a while, which is totally normal. After a few minutes the Cluster reached the ready state and stayed there, so everything was fine with it.
Then the NodePool resource started to create the node pool inside GCP. But as soon as the node pool was created, the resource immediately started trying to update it again. This caused ~3000 UpdateNodePool API requests against GCP within 24h. It did not stop, even after waiting for a longer time. I was wondering where this behaviour comes from, but I could not find another issue covering this problem.

This then impacted cluster availability, as the cluster could never operate properly: it was always in an updating state because the NodePool was constantly updating something.

I expected the resource to be quiet after it creates the NodePool and to issue UpdateNodePool requests only if something diverges from its spec.
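For illustration, the expected behaviour could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the provider's actual code: `NodePoolParams`, `lateInitialize`, and `isUpToDate` are hypothetical names, and the struct holds only a few of the fields from the manifests. The idea is that fields left unset in the spec are first filled in from what the API reports, and an update is issued only when the remaining fields actually differ.

```go
package main

import (
	"fmt"
	"reflect"
)

// NodePoolParams is a hypothetical, trimmed-down stand-in for the NodePool
// spec; it is NOT the provider's real type.
type NodePoolParams struct {
	MachineType    string
	DiskSizeGb     int64
	MaxSurge       *int64 // nil means "not set in the spec"
	MaxUnavailable *int64 // nil means "not set in the spec"
}

// lateInitialize copies values reported by the API into fields the user left
// unset, so that server-side defaults do not count as drift.
func lateInitialize(desired *NodePoolParams, observed NodePoolParams) {
	if desired.MaxSurge == nil {
		desired.MaxSurge = observed.MaxSurge
	}
	if desired.MaxUnavailable == nil {
		desired.MaxUnavailable = observed.MaxUnavailable
	}
}

// isUpToDate reports whether the observed state already matches the desired
// spec. reflect.DeepEqual compares the pointed-to values, not the pointers.
func isUpToDate(desired, observed NodePoolParams) bool {
	return reflect.DeepEqual(desired, observed)
}

func main() {
	one := int64(1)
	desired := NodePoolParams{MachineType: "e2-small", DiskSizeGb: 10}
	observed := NodePoolParams{
		MachineType:    "e2-small",
		DiskSizeGb:     10,
		MaxSurge:       &one,
		MaxUnavailable: &one,
	}

	lateInitialize(&desired, observed)
	if isUpToDate(desired, observed) {
		fmt.Println("up to date: no UpdateNodePool call needed")
	} else {
		fmt.Println("drift detected: issue UpdateNodePool")
	}
}
```

If a comparison like this never reports "up to date" (for example because the API normalizes a value that the spec states differently), every reconcile would trigger another update. Whether that is what happens here is not confirmed.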

How can we reproduce it?

I used the following manifests for the Cluster and NodePool:

apiVersion: container.gcp.crossplane.io/v1beta2
kind: Cluster
metadata:
  name: test-cluster
spec:
  deletionPolicy: Delete
  forProvider:
    location: europe-west3-a
    network: default
    maintenancePolicy:
      window:
        dailyMaintenanceWindow:
          # this is GMT, which means that the actual window starts at 23:00
          startTime: 21:00
    verticalPodAutoscaling:
      enabled: true
    addonsConfig:
      gcePersistentDiskCsiDriverConfig:
        enabled: true
    initialClusterVersion: latest
    releaseChannel:
      channel: STABLE
---
apiVersion: container.gcp.crossplane.io/v1beta1
kind: NodePool
metadata:
  name: test-cluster-pool
spec:
  deletionPolicy: Orphan
  forProvider:
    autoscaling:
      minNodeCount: 3
      maxNodeCount: 5
      autoprovisioned: false
      enabled: true
    clusterRef:
      name: test-cluster
    config:
      machineType: e2-small
      diskSizeGb: 10
      imageType: cos_containerd
      oauthScopes:
        - "https://www.googleapis.com/auth/devstorage.read_only"
        - "https://www.googleapis.com/auth/compute"
    upgradeSettings:
      maxSurge: 1
      maxUnavailable: 1
    initialNodeCount: 3
    locations:
      - europe-west3-a

You can then see the NodePool resource start issuing update requests. I monitored this behaviour by looking at the audit logs of the service account in use.

What environment did it happen in?

Crossplane version: v0.21.0

Thank you for your help, and please let me know if I need to provide any more insight into the issue!

@aaronschweig aaronschweig added the bug Something isn't working label May 14, 2022
@jesumyip

Is there a fix for this?

@aaronschweig
Author

Is there a fix for this?

Unfortunately not. As a workaround, I made sure my deletion policy retains the external resource (the NodePool manifest above uses deletionPolicy: Orphan) and deleted the NodePool resource after everything was created. This obviously defeats the benefit of managing it as a CRD, so I would be happy if this issue could be resolved somehow. I also haven't tried again with a newer version of Crossplane, so the described behaviour may still occur there.
