Use specific client set for operations #105
This PR adds usage of a specific client set for each specific operation. We now use the client set built from the kubeconfig parameter only for creating and watching the experiment pod. And we use the client set built from litmuskubeconfig for watching/updating the chaosengine pod, which may run on a different cluster. Signed-off-by: Ondra Machacek <[email protected]>
flag.Parse()
// Use in-cluster config if kubeconfig path is specified
if *litmuskubeconfig == "" {
	configLitmus, err := rest.InClusterConfig()
Should this not be configExperiment ?
Actually no, whatever is specified by the litmuskubeconfig parameter is then defined in the configLitmus variable. And whatever is specified in the kubeconfig parameter is then defined in the configExperiment variable.
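The flag-to-variable mapping described in this reply can be sketched roughly as follows. This is only an illustration using the standard library: `configSource` and `parseConfigSources` are hypothetical helpers, not code from this PR, and they only report where each config would come from. In the real runner, the empty-path branch would presumably call `rest.InClusterConfig()` as in the diff, and the file branch would load the kubeconfig (for example via client-go's `clientcmd.BuildConfigFromFlags`).

```go
package main

import (
	"flag"
	"fmt"
)

// configSource describes where a client configuration would be loaded from.
// An empty path means "fall back to the in-cluster configuration".
// Hypothetical helper for illustration only.
func configSource(path string) string {
	if path == "" {
		return "in-cluster"
	}
	return "file:" + path
}

// parseConfigSources parses both flags and returns the source of
// configLitmus (from -litmuskubeconfig) and configExperiment (from -kubeconfig).
func parseConfigSources(args []string) (litmus, experiment string, err error) {
	fs := flag.NewFlagSet("chaos-runner", flag.ContinueOnError)
	kubeconfig := fs.String("kubeconfig", "", "absolute path to the kubeconfig file")
	litmuskubeconfig := fs.String("litmuskubeconfig", "", "absolute path to the kubeconfig file")
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	return configSource(*litmuskubeconfig), configSource(*kubeconfig), nil
}

func main() {
	// Only -litmuskubeconfig is given, so the experiment config falls back to in-cluster.
	litmus, experiment, err := parseConfigSources([]string{"-litmuskubeconfig", "/tmp/litmus.yaml"})
	if err != nil {
		fmt.Println("flag parsing failed:", err)
		return
	}
	fmt.Println("configLitmus:", litmus)
	fmt.Println("configExperiment:", experiment)
}
```

Each flag falls back independently, so either cluster can be the one the runner itself is deployed in.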
Ok, got it. Could we name this better? Can we say litmusControlPlaneKubeConfig as the one which deals w/ chaos-operator/runner/experimentCR/engineCR & targetKubeConfig as the one that deals w/ the target application & also the experiment job/helper pod & chaosresult resources (note that the chaosresult CR is created by the experiment and thereby will reside in the target cluster). The logic to read chaosResults and patch them to the engine may need to be looked at.
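To make the proposed split easier to review, here is a small sketch of the resource-to-config assignment exactly as this comment words it. The `Cluster` type, the `ownerOf` table, and `clusterFor` are hypothetical names used for illustration; nothing like this exists in the PR, and the resource keys are informal labels rather than real Kubernetes kinds.

```go
package main

import "fmt"

// Cluster names the kubeconfig (and hence the client set) responsible for a resource.
type Cluster string

const (
	// Names taken from the naming suggestion in the review comment.
	ControlPlane Cluster = "litmusControlPlaneKubeConfig"
	Target       Cluster = "targetKubeConfig"
)

// ownerOf records the suggested assignment. Keys are informal labels (assumption).
var ownerOf = map[string]Cluster{
	"chaos-operator":     ControlPlane,
	"chaos-runner":       ControlPlane,
	"experimentCR":       ControlPlane,
	"engineCR":           ControlPlane,
	"target-application": Target,
	"experiment-job":     Target,
	"helper-pod":         Target,
	"chaosresult":        Target, // created by the experiment, so it lives in the target cluster
}

// clusterFor returns which config should be used to reach the given resource.
func clusterFor(resource string) (Cluster, error) {
	c, ok := ownerOf[resource]
	if !ok {
		return "", fmt.Errorf("no cluster assigned for resource %q", resource)
	}
	return c, nil
}

func main() {
	for _, r := range []string{"engineCR", "chaosresult"} {
		c, err := clusterFor(r)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s -> %s\n", r, c)
	}
}
```

Under this split, reading a chaosresult and patching the engine touches both clusters, which is the logic the comment flags for a closer look.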
Will do.
 // Use in-cluster config if kubeconfig path is specified
 if *kubeconfig == "" {
-	config, err := rest.InClusterConfig()
+	configExperiment, err := rest.InClusterConfig()
And this, the configLitmus?
kubeconfig := flag.String("kubeconfig", "", "absolute path to the kubeconfig file")
litmuskubeconfig := flag.String("litmuskubeconfig", "", "absolute path to the kubeconfig file")
I suppose we can also distinguish the usage/help message to specify:
litmuskubeconfig - "absolute path to the kubeconfig file of target cluster where experiment job is launched"
Will do.
ref: #105 (comment)
@@ -116,7 +116,7 @@ func BuildingAndLaunchJob(experiment *ExperimentDetails, clients ClientSets) err

 // launchJob spawn a kubernetes Job using the job Object received.
 func (experiment *ExperimentDetails) launchJob(job *batchv1.Job, clients ClientSets) error {
-	_, err := clients.KubeClient.BatchV1().Jobs(experiment.Namespace).Create(job)
+	_, err := clients.KubeClientExperiment.BatchV1().Jobs(experiment.Namespace).Create(job)
Just trying to understand which config is used for which purpose.
- kubeconfig & KubeClientExperiment for ChaosEngine, ChaosExperiment, ChaosRunner, ChaosOperator
- litmuskubeconfig & k8sClientSetLitmus for TargetApplication, ExperimentJob/Pod, ChaosResult
Right? If this is the case, would you like to use the clients.KubeClientExperiment to get the chaosresult verdict here?
Well, I was thinking of having the chaosresult used by kubeconfig. Do you think it should be used by litmuskubeconfig?
Also, just thinking out loud on the impacts/benefits (use cases) of this change. Today, the chaos resources (CRs) and the associated pod/job resources are tightly tied together.
This would mean passing multiple configs around across components. While feasible, I was wondering if this is really needed? @machacekondra, there is a feature in the (up and coming) litmus-portal which allows you to connect an "external cluster" and execute chaos against it via an agent (that is auto-deployed when this 'cluster connect' happens). The target/external cluster also has the entire set of litmus components installed into it. The only difference is that in the case of the portal we run chaos workflows (Argo workflows that embed the engine spec) rather than the engine directly. Do you think this satisfies your use case? Here is a ref to the user guide: https://docs.google.com/document/d/1fiN25BrZpvqg0UkBCuqQBE7Mx8BwDGC8ss2j2oXkZNA/edit#heading=h.solcj2aoqyws
Adding to @ksatchit's comments, a few more complexities can arise if we use different kubeconfigs for the runner and the experiment pod. Consider node attributes/operations, i.e., node-selector, node-taint, etc. We bind those pods to the same node so that they remain unaffected in case of node-level chaos. The separate cluster/kubeconfig may have different node names/labels, which can result in failures for those cases.
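The node-pinning concern can be illustrated with a simplified sketch. It assumes a plain equality-based node selector and made-up node names and labels; `matchesSelector` and `schedulableNodes` are hypothetical helpers, not part of chaos-runner or the Kubernetes scheduler.

```go
package main

import (
	"fmt"
	"sort"
)

// matchesSelector reports whether a node's labels satisfy every
// key/value pair of an equality-based node selector.
func matchesSelector(selector, nodeLabels map[string]string) bool {
	for k, want := range selector {
		got, ok := nodeLabels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// schedulableNodes returns the sorted names of nodes that satisfy the selector.
func schedulableNodes(selector map[string]string, nodes map[string]map[string]string) []string {
	var names []string
	for name, labels := range nodes {
		if matchesSelector(selector, labels) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Selector derived from the node the runner is on (illustrative values).
	selector := map[string]string{"kubernetes.io/hostname": "node-a"}

	clusterA := map[string]map[string]string{
		"node-a": {"kubernetes.io/hostname": "node-a"},
		"node-b": {"kubernetes.io/hostname": "node-b"},
	}
	clusterB := map[string]map[string]string{
		"worker-1": {"kubernetes.io/hostname": "worker-1"},
	}

	fmt.Println("cluster A candidates:", schedulableNodes(selector, clusterA))
	fmt.Println("cluster B candidates:", schedulableNodes(selector, clusterB))
}
```

A selector computed against the runner's cluster finds no candidate node in a cluster with different node names, which is the failure mode described above.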
Just went quickly through the targeted cluster doc and it should solve the use case. I will try it.
@ispeakc0de @ksatchit Is there any plan to have some API for the
@machacekondra that's a really good point. IMO, we should definitely have a well-structured and documented API for this, and I hope this is already being actively thought about. Tagging @gdsoumya and @rajdas98 to respond.
@machacekondra we have GraphQL APIs exposed, though they are currently undocumented for public use. It should be possible to interact with the server directly using those. We can give a quick walkthrough if you want to try it immediately; we will document the APIs soon so that they can be used directly.
Well, I don't find it very easy to use multiple clusters from the portal, so I would welcome it if this were somehow possible via multiple kubeconfig files. Another approach would be to have a special kubeconfig file only for the helper pod, if that could somehow solve the gaps. What do you think?
Hey @machacekondra, sorry to hear you are not finding the portal convenient; hopefully this is addressed in the upcoming builds. Having said that, launching helpers alone (and maybe any other specific actions on the target clusters, via a different kubeconfig) on a case-by-case basis does seem like a better option than changing it in chaos-runner. It makes things simpler by one level. I would like to hear @ispeakc0de's views on this too.
Hi @machacekondra, can you share the concerns or issues that you are facing while trying to use the portal? Once we have the API doc public, it should be possible to automate the external cluster connects and also the chaos workflow deployments. If you want to try the API way of doing things, we can give you rough documentation for now.
@gdsoumya Well, I think then you are locked into using the litmus-portal only with chaos workflows. Users can't then simply use ChaosEngine etc., am I correct?
Hi @machacekondra, yes, you are right: you cannot directly use chaosengines, but you can surely use them inside your chaos-workflow. If your goal is just to run an experiment, you can run a workflow with only one step that executes a chaosengine. Yes, the above flow should be possible with the APIs. As I mentioned, our APIs are GraphQL APIs, so you can use any GraphQL client of your choice in your programming language, or even craft HTTP requests to perform the gql operations directly. Once you set up your portal (log in and set up the admin project) and have an exposed address for the backend server, you first need to get an auth token from the auth server using an API call and then use that token as an auth header in your gql queries. That's the only extra step you need to perform. After that there are basically only 2 operations or API calls that you need to make: the first one is to get the manifest to connect the external cluster, and the second is to create the workflow that will be executed. If you want, we can write a sample script/program in Go to show how you can do the above using the APIs through HTTP requests.
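As a rough idea of what "crafting HTTP requests" for the gql operations could look like, here is a minimal Go sketch that only builds an authenticated GraphQL request. Since the portal APIs are undocumented at this point, everything specific is an assumption: the endpoint URL, the placeholder query, and the choice to send the token verbatim in the `Authorization` header are hypothetical, and `encodeGQL` and `newGQLRequest` are illustrative helper names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// gqlPayload is the conventional JSON envelope for a GraphQL-over-HTTP request.
type gqlPayload struct {
	Query     string                 `json:"query"`
	Variables map[string]interface{} `json:"variables,omitempty"`
}

// encodeGQL serializes a query and its optional variables into a request body.
func encodeGQL(query string, vars map[string]interface{}) ([]byte, error) {
	return json.Marshal(gqlPayload{Query: query, Variables: vars})
}

// newGQLRequest builds (but does not send) an authenticated GraphQL POST request.
// ASSUMPTION: the token obtained from the auth server is sent as-is in the
// Authorization header; the real server may expect a different header or scheme.
func newGQLRequest(endpoint, token, query string, vars map[string]interface{}) (*http.Request, error) {
	body, err := encodeGQL(query, vars)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", token)
	return req, nil
}

func main() {
	// Hypothetical endpoint, token, and query; replace with the real ones
	// once the API documentation is available.
	req, err := newGQLRequest("http://localhost:8080/query", "example-token", "query { placeholder }", nil)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The two calls mentioned above (fetching the cluster-connect manifest and creating the workflow) would each be one such request with a different query string, sent with `http.DefaultClient.Do(req)` or a configured client.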