storage-check-pvc-init-job constantly failing #14
👋 Welcome to Kuberhealthy Storage Check. Thanks for opening your first issue.
Can you give me more details about the environment you're running in? This was running successfully in a VMware environment, but everyone does things slightly differently. Thanks!
Also, did you give the service account the proper role to create the storage? That is critical and may be what you're running into. If you look in the deploy directory, you'll need to make sure the service account (storage-sa in this case) has the permissions (a proper role and role binding) to create the storage. Let me know if that helps.
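As a rough illustration of the role and role binding mentioned above, a minimal sketch might look like the following. Only the service account name storage-sa comes from this thread; the namespace, the object names, and the exact resources and verbs are assumptions, so compare them against the manifests in the deploy directory rather than treating this as the project's actual configuration.

```yaml
# Hypothetical Role and RoleBinding for the storage check; names, namespace,
# and the rule list are assumptions for illustration only.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: storage-check-role        # hypothetical name
  namespace: kuberhealthy         # assumed namespace
rules:
  # Allow the check to provision and clean up its PVCs.
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "create", "delete"]
  # Allow the check to launch and clean up its init/check jobs.
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: storage-check-rolebinding # hypothetical name
  namespace: kuberhealthy         # assumed namespace
subjects:
  - kind: ServiceAccount
    name: storage-sa              # service account named in this thread
    namespace: kuberhealthy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: storage-check-role
```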
@ChrisHirsch what kind of info do you need? There are a lot of things I could tell you about our environment 👍 Creating the PV (storage) is not a problem; the problem is writing to it once it is created and attached to the pod. We have no restrictions on deployments/pods creating storage and working with it. It is strange that the pod would be able to create the storage object (PV) but would then lack the rights to write to it. We have no restrictions on the storage class: anything that wants to use the storage class can create a PV and mount it.
Can you by chance drop in the logs from the pod for the storage-check? Hopefully that will shed some light. Obviously this hasn't seen many environments yet, but I do feel that this should be storage agnostic, as it simply provisions storage from the SC, creates a file on the PVC, and then shares it around to the various nodes. Of course, I'm sure I'll be proven wrong; I have probably made some assumptions that are not necessarily generic, and that is probably what you're running into. Thanks for your patience!
@ChrisHirsch sorry, I missed your comment. The only log I got from that pod was the one above: /bin/sh: 1: cannot create /data/index.html: Permission denied.
We're hitting this as well on AKS. The same /bin/sh: 1: cannot create /data/index.html: Permission denied. pops out on some of the storage classes we're testing (managed-premium and default).
k logs storage-check-default-1620284224 -n synthetic
k logs storage-check-pvc-default-init-job-wfrnz -n synthetic
Can we reopen?
Sure, so can I get some additional information on these environments? If it's easier, we can also chat in #kuberhealthy.
Sure; what kind of information could help?
So, I dug around and found that the PVC gets mounted under /data/ with the following permissions:
By default the PVC jobs don't specify a securityContext. As a result, at least in my environment, they don't have enough permissions to modify files under /data/.
and this ran correctly. By entering the container I was able (as expected) to open and create files under /data/.
So, I tested out another custom job which mounts the same security context as the default specified in your yaml files:
this actually succeeded:
So a solution to this issue would be making sure that the jobs are created with a correct securityContext. However (and correct me if I'm wrong) those jobs are created at https://github.com/Comcast/kuberhealthy/, not in your repo, so maybe we have to fix something on Comcast's side.
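For anyone wanting to reproduce that test, a minimal sketch of a job with an explicit pod-level securityContext could look like the following. The job name, image, claim name, and the numeric user/group IDs are all assumptions for illustration and are not taken from the repository's manifests; only the synthetic namespace and the /data/index.html path appear in this thread. Setting fsGroup asks Kubernetes to make the mounted volume group-writable for that GID on volume types that support ownership management, which is usually what allows a non-root container to write to a freshly provisioned PVC.

```yaml
# Hypothetical test Job; names, image, claim name, and IDs are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: pvc-write-test            # hypothetical name
  namespace: synthetic            # namespace used in the logs above
spec:
  template:
    spec:
      restartPolicy: Never
      # Pod-level security context: run as a fixed non-root user and
      # request group ownership of the volume via fsGroup.
      securityContext:
        runAsUser: 1000           # assumed UID
        runAsGroup: 1000          # assumed GID
        fsGroup: 1000             # assumed GID applied to the mounted volume
      containers:
        - name: writer
          image: busybox          # assumed image
          # Same kind of write the init job attempts, then list the result.
          command: ["/bin/sh", "-c", "echo ok > /data/index.html && ls -l /data"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: storage-check-pvc   # assumed claim name
```

If a job like this succeeds while the same job without the securityContext block fails with Permission denied, that points at the missing security context rather than at the storage class itself.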
In case someone is still battling with this. The problem is that the pvc init job runs as whatever the user of the docker image is (as it does not set a |
Hello,
I'd like to ask about the storage check job. The only thing I get from storage-check-pvc-init-job is /bin/sh: 1: cannot create /data/index.html: Permission denied. I've just tried enabling allowPrivilegeEscalation and it didn't help. Normally we have no problem writing to PVs. I'm thinking about securityContext.readOnlyRootFilesystem, but this switch is quite dangerous for production as it is a global switch.
Is it possible that this test is not compatible with the old StorageClass we are still using?
What am I missing?
EDIT: nope, not working even with
securityContext.readOnlyRootFilesystem=false