
[Feature] Support for anti-affinity/affinity rules for the created machines #175

Closed
sidharthsurana opened this issue Jan 22, 2019 · 35 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@sidharthsurana
Contributor

vSphere DRS supports defining anti-affinity/affinity rules for VMs. This feature is to add support for the user to specify affinity/anti-affinity grouping for the VMs.

Use case: A user creates 3 Machine objects and wants all 3 VMs to run on different hosts to improve resiliency against host failures. This can easily be realized by creating an anti-affinity rule for the 3 VMs.
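
For context, a minimal govmomi sketch (assuming an authenticated client) of what creating such a DRS VM anti-affinity rule looks like at the vSphere API level. This is an illustration only, not the provider's implementation; clusterPath, ruleName and vmRefs are placeholders supplied by the caller.

    // Sketch: create a DRS VM anti-affinity rule for the VMs backing a set of
    // Machine objects. Assumes an authenticated govmomi client; clusterPath,
    // ruleName and vmRefs are caller-supplied placeholders.
    package drsrules

    import (
        "context"

        "github.com/vmware/govmomi"
        "github.com/vmware/govmomi/find"
        "github.com/vmware/govmomi/vim25/types"
    )

    func AddAntiAffinityRule(ctx context.Context, c *govmomi.Client, clusterPath, ruleName string, vmRefs []types.ManagedObjectReference) error {
        finder := find.NewFinder(c.Client, false)

        // clusterPath is a full inventory path, e.g. "/DC0/host/Cluster0" (placeholder).
        cluster, err := finder.ClusterComputeResource(ctx, clusterPath)
        if err != nil {
            return err
        }

        spec := &types.ClusterConfigSpecEx{
            RulesSpec: []types.ClusterRuleSpec{{
                ArrayUpdateSpec: types.ArrayUpdateSpec{Operation: types.ArrayUpdateOperationAdd},
                Info: &types.ClusterAntiAffinityRuleSpec{
                    ClusterRuleInfo: types.ClusterRuleInfo{
                        Name:      ruleName,
                        Enabled:   types.NewBool(true),
                        Mandatory: types.NewBool(true),
                    },
                    Vm: vmRefs, // references to the VMs backing the Machines
                },
            }},
        }

        // Reconfigure the cluster to add the rule and wait for the task.
        task, err := cluster.Reconfigure(ctx, spec, true)
        if err != nil {
            return err
        }
        return task.Wait(ctx)
    }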

@frapposelli
Member

/kind feature
/priority important-longterm
/assign @sidharthsurana

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Feb 1, 2019
@frapposelli frapposelli added this to the Next milestone Feb 1, 2019
@sflxn sflxn modified the milestones: Next, v1alpha1 Feb 1, 2019
@sflxn sflxn modified the milestones: v1alpha1, Next Mar 6, 2019
@sflxn

sflxn commented Mar 6, 2019

This could potentially get done for v1alpha1, but it isn't a feature that absolutely needs to go in for this release. If it gets done in time and the PR has been reviewed, we can make a judgment call at that time.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 4, 2019
@moshloop

moshloop commented Jun 7, 2019

Any reason why this can't be supported without DRS? I'm not sure DRS works across multiple clusters, and scheduling across multiple clusters in a vCenter seems like a reasonable thing to do.

@akutz
Contributor

akutz commented Jun 7, 2019

This may need to be punted to v1alpha2. If not, we need to resource this work yesterday.

cc @frapposelli

@frapposelli
Member

@moshloop affinity/anti-affinity rules are defined within a DRS-enabled cluster, and a cluster is also a fault-domain boundary for vSphere. Deploying across multiple clusters should be possible today using MachineSets.

@akutz this is potentially a noop (see kubernetes/cloud-provider-vsphere#179), but even if not, it definitely needs to be punted to v1alpha2.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 7, 2019
@moshloop

moshloop commented Jul 8, 2019

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 8, 2019
@sujeet-banerjee

Any reason why this can't be supported without DRS? I'm not sure DRS works across multiple clusters, and scheduling across multiple clusters in a vCenter seems like a reasonable thing to do.

Short answer: No. Without DRS, it won't be effective.

Long answer:
#1 One may use VM-Host rules without DRS enabled on a vSphere cluster. However, the worker nodes will not be automatically migrated (vMotioned) to maintain a balanced spread of nodes across hosts (i.e. HA in the true sense). And the CRDs do not expose host details.
#2 That also means vMotion must be enabled for the hosts.
More details: https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.resmgmt.doc/GUID-7D3ABD21-4524-42E9-B7FE-6AAF6766433B.html

I have been working on a design proposal for supporting anti-affinity/affinity. Attached:
Spec_changes_for_AntiAffinity.docx
Test_n_Demo.pdf

@moshloop

moshloop commented Aug 6, 2019

@sujeet-banerjee Can you create a google doc for commenting and review?

vMotion is something I believe should be turned off for Kubernetes clusters, as it conflicts with Kubernetes' view of the system. If a host fails, Kubernetes should reschedule the pods that become NotReady; the Cluster API MachineSet should then detect unresponsive nodes and create new ones.

The same applies to DRS: it should be turned off, and Kubernetes-native components like the descheduler used to move and rebalance workloads.

Without DRS or vMotion, machine affinity/anti-affinity can easily be implemented when creating a machine by listing all available hosts/clusters and running through the rules list, in the same way pod affinity is implemented; a rough sketch of that idea follows below.
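
To make that concrete, here is a hedged Go sketch of such a DRS-free placement pass: list candidate hosts, exclude those already running a VM from the same anti-affinity group, and pick the least loaded of the remainder. The Host type and PickHost helper are hypothetical, for illustration only, not an agreed design.

    // Sketch of DRS-free anti-affinity placement, as described above: list the
    // candidate hosts, drop those already running a VM from the same
    // anti-affinity group, and pick the least-loaded host that remains.
    // The Host type and PickHost helper are hypothetical.
    package placement

    import (
        "errors"
        "sort"
    )

    type Host struct {
        Name    string
        VMCount int // number of VMs currently on the host
    }

    // PickHost chooses a host for a new machine in a given anti-affinity group.
    // usedByGroup is the set of host names already running a VM from that group.
    func PickHost(candidates []Host, usedByGroup map[string]bool) (Host, error) {
        var eligible []Host
        for _, h := range candidates {
            if !usedByGroup[h.Name] {
                eligible = append(eligible, h)
            }
        }

        // Hard (required) anti-affinity: fail if every host already runs a VM
        // from the group. A soft (preferred) rule would fall back to the full
        // candidate list here instead.
        if len(eligible) == 0 {
            return Host{}, errors.New("no host satisfies the anti-affinity rule")
        }

        // Prefer the least-loaded eligible host, i.e. a simple spread policy.
        sort.Slice(eligible, func(i, j int) bool {
            return eligible[i].VMCount < eligible[j].VMCount
        })
        return eligible[0], nil
    }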

@andrewsykim
Member

Can you create a google doc for commenting and review?

+1 to this, please use Google docs so we can comment/review :)

@sujeet-banerjee

sujeet-banerjee commented Aug 7, 2019

use Google Docs...

Could you point me to the shared location where I could add the doc?

@moshloop

Without DRS or vMotion, machine affinity/antiAffinity can easily be implemented when creating a machine by listing all available hosts/clusters and running through rules list in the same way pod affinity is implemented.

I somewhat disagree on a few things:
#1 In my opinion, it's not a good idea to tie/specify host details in the affinity definition (Machine/MachineSet CRDs). As I understand it, ESXi hosts may be added or taken down at the will of the end users within a vSphere cluster.
#2 Similarly, enabling/disabling DRS should be the end users' choice. An end user may want to use DRS features, and CAPI/Kubernetes should not be restrictive against using them.

@sujeet-banerjee

Proposal Doc: https://docs.google.com/document/d/1fNm53l5K0OfPrGc3zhjNDYzVQNBEc85uEJfNwtiDQrY

I would invite folks to review the proposal and add suggestions (if any).

Thanks,
Sujeet

@davidopp

davidopp commented Aug 8, 2019

I believe
kubernetes/enhancements#997
kubernetes/enhancements#1127
are related?

@brysonshepherd

@moshloop
Wouldn't it be better to vMotion, rather than build a whole new node on another host? To me that would lead to a longer time with pending pods.

@moshloop

moshloop commented Aug 8, 2019

@brysonshepherd vMotion requires shared storage, which is really expensive, slow, and unreliable (compared to just starting a new VM).

@moshloop

moshloop commented Aug 8, 2019

it's not a good idea to tie/specify host details in the affinity definition

Anti-affinity using a topology key would work as new hosts, clusters, and datacenters are added and removed:

e.g.

    vmAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        topologyKey: "vmware.io/hostname"

and to distribute nodes across multiple clusters:

    vmAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        topologyKey: "vmware.io/clustername"
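
If a field along those lines were added to this provider, a hypothetical Go API type could back the YAML above; none of these names exist in the codebase today, this is only a sketch of the shape the snippet implies.

    // Hypothetical API types mirroring the YAML above; these fields do not
    // exist in the current VSphereMachine API and are shown only to make the
    // proposed shape concrete.
    package api

    // VMAntiAffinity describes how the VMs created for Machines should be spread.
    type VMAntiAffinity struct {
        // RequiredDuringSchedulingIgnoredDuringExecution holds a hard anti-affinity
        // term, evaluated only at VM creation time (no vMotion afterwards).
        // Pod anti-affinity uses a list of terms here; a single term keeps the
        // sketch aligned with the YAML above.
        RequiredDuringSchedulingIgnoredDuringExecution *VMAffinityTerm `json:"requiredDuringSchedulingIgnoredDuringExecution,omitempty"`
    }

    // VMAffinityTerm selects the topology level to spread across, e.g.
    // "vmware.io/hostname" or "vmware.io/clustername".
    type VMAffinityTerm struct {
        TopologyKey string `json:"topologyKey"`
    }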

@brysonshepherd

brysonshepherd commented Aug 8, 2019

@moshloop

Some users may pay for that higher level of service to get higher uptime.

In my organization's current use case, DRS and vMotion are needed.

@moshloop

moshloop commented Aug 8, 2019

higher uptime

Shared storage is much more likely to reduce uptime than to increase it, especially when your application is stateless and doesn't need persistent storage. Even when an application requires clustered or replicated storage, the control plane (Kubernetes) should not share the same fault domain.

vMotion has its advantages and is often the easiest solution in terms of operational overhead, but if you want to achieve the highest levels of resilience and uptime, your Kubernetes nodes shouldn't be using it.

@brysonshepherd

@moshloop I'm not saying that vMotion is what I want; it is just faster than making a new node. If making a new node (along with rescheduling/starting up the pods) were faster, then that is what I would do. But it isn't, at least not that I'm aware of.

@moshloop

moshloop commented Aug 8, 2019

The time to recover a pod doesn't need to be instant, just reasonable. You should be running multiple pods with enough headroom to survive the loss of one pod. If you lose a physical ESXi host, the VM doesn't get vMotioned; it gets booted on a different host, potentially goes through disk crash recovery, and rejoins the cluster. In the meantime, the pods running on that node may have already been detected as down and rescheduled.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 19, 2019
@akutz
Contributor

akutz commented Dec 17, 2019

ping @pdaigle

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 16, 2020
@moshloop

/remove-lifecycle stale

@moshloop

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 17, 2020
jayunit100 pushed a commit to jayunit100/cluster-api-provider-vsphere that referenced this issue Feb 26, 2020
- Set keyname on instances
- Better handle certificate missing from machine status in GetKubeConfig
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 16, 2020
@yastij yastij removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 16, 2020
@vincepri
Member

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Apr 16, 2020
@jayunit100
Contributor

Naively looking at this and wondering:

Since @srm09 added https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/pull/1182/files, would it be possible to hijack VerifyAffinityRule(ctx computeClusterContext, clusterName, hostGroupName, vmGroupName string) (Rule, error) somehow, to do some kind of affinity/anti-affinity VM spreading not at a regional level, but just at an ESXi host level?

@srm09
Contributor

srm09 commented Jan 30, 2022

@jayunit100 I am not sure I quite understand what you are looking for here. Currently, multi-AZ allows you to attach VMs to specific hosts, if the host groups are set up to point to a particular host.
Can you elaborate on this ask?

@vincepri vincepri removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jan 31, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 1, 2022
@srm09
Contributor

srm09 commented May 10, 2022

/close
This has been implemented as the multi-AZ feature using the VSphereDeploymentZone and VSphereFailureDomain CRDs.

@k8s-ci-robot
Contributor

@srm09: Closing this issue.

In response to this:

/close
This has been implemented as the multi-AZ feature using the VSphereDeploymentZone and VSphereFailureDomain CRDs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sujeet-banerjee

Does VSphereDeploymentZone tie the machines (VMs) to specific ESXi hosts? As I understand it, ESXi hosts may be added or taken down at the will of the end users within a vSphere cluster. I'm not sure it's a good idea to tie/specify host details in the affinity definition (Machine/MachineSet CRDs).
