Proposal: "STATUS" verb #859
Wow, looks like @danwinship had the same idea in #628.
Two interesting observations from the CNI meeting:
- IP address exhaustion is funny, because DEL will still succeed even if ADD won't. So a STATUS error of "no available IPs" should not keep the runtime from executing DELs, but should block ADDs.
- It would be interesting if there were a way to return a STATUS response on ADD or DEL failure, so the runtime knows that everything is broken without having to also poll STATUS.
Other considerations: we need to add some wording around timing. Plugins have to handle ADD even after they return a failure from STATUS (with a reasonable error message). Runtimes should avoid calling ADD if STATUS has failed.
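To make that runtime-side behavior concrete, here is a minimal Go sketch (the types and function names are illustrative assumptions, not part of the proposal): the runtime remembers the most recent STATUS result, refuses new ADDs while it indicates failure, and still attempts DELs.

```go
// Sketch: gate ADD on the last STATUS result, but always attempt DEL.
// All names here are illustrative; they are not from the CNI spec or libcni.
package main

import (
	"errors"
	"fmt"
)

type runtime struct {
	statusErr error // result of the most recent STATUS call; nil means healthy
}

func (r *runtime) addNetwork(containerID string) error {
	// A failed STATUS (e.g. "no available IPs") should block new ADDs...
	if r.statusErr != nil {
		return fmt.Errorf("refusing ADD for %s: %w", containerID, r.statusErr)
	}
	// ...otherwise exec the plugin with CNI_COMMAND=ADD here.
	return nil
}

func (r *runtime) delNetwork(containerID string) error {
	// ...but DEL is always attempted: it can still succeed (and free up IPs)
	// even while STATUS is reporting a failure.
	// Exec the plugin with CNI_COMMAND=DEL here.
	return nil
}

func main() {
	r := &runtime{statusErr: errors.New("no available IPs")}
	fmt.Println(r.addNetwork("ctr-1")) // blocked by the failed STATUS
	fmt.Println(r.delNetwork("ctr-1")) // still allowed: prints <nil>
}
```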
How would a stateless plugin record its status in an efficient way, without memory or a cache?
Yeah, but I don't think that's the right answer any more. I mean, yes, CNI should have a way to indicate "I'm out of IPs". But for the case of Kubernetes network plugins, we should just accept that this is a more complicated problem than CNI network plugin readiness, and deal with this at the Kubernetes / CRI level. E.g., kubelet could have a mode in which it decides that the network is ready if and only if there is a host-network pod running on the node with a particular label.
I'm not a huge fan of using pod readiness solely to indicate plugin status: we've found it painful to have pod readiness depend on the state of other pods, and in some cases a network plugin may need multiple components to be functioning in order to be ready.
These are both valuable but different... I think Dan's solution is orthogonal to this, right?
With Dan's proposal, CNI providers can give vendor-specific suggestions/heuristics on the state of things that might not be reflected in individual ADD/DEL calls. With Casey's proposal, individual runtimes can cascade structured information up the kubelet -> CRI -> CNI API chain of command, to be used in a more modular way / broader context (i.e., metrics, monitoring, retry calibration, or whatever).
I don't see why it needs to be painful... it seems like something somewhere in the system must know whether or not the network is ready, and you can just expose that via a trivial http server on the "main" pod.
Well, Casey mentioned "They're a shim to a daemon, and that daemon is down." and "Their configuration file is somehow invalid or broken" as use cases, and I'm saying I don't think CNI is the right level to solve those problems (unless people want to implement a CNI 2.0 that is totally different from the current spec). The CNI spec works well when your plugin is fully installed before anyone needs to use it, and remains fully available until after everyone is done with it. It's really not well designed to handle the case of plugins that are installed after the CNI-consuming system starts up, and which might occasionally go away and then come back (e.g. due to being upgraded).
Yes, the problem is not exposing a "STATUS" verb on plugins; before that, a new CNI 2.0 is needed that runs as a daemon and handles plugin registration, periodic status checks, and dynamic configuration. Then the "STATUS" verb will work well under the new spec.
I think that's what's nice about a theoretical STATUS verb. We already have a protocol for runtimes to talk to network plugins - creating a new API just for readiness seems a bit unnecessary. Instead, you can put that logic in your plugin binary. If it makes sense internally for that to be an HTTP call, then great. It doesn't actually require any commitments to the initialization status of your plugin. I agree that it doesn't cover all possible cases with Kubernetes (i.e., the chicken-and-egg problem w.r.t. dropping configuration files and binaries), but, honestly, I'm having a hard time thinking of any that aren't covered by the "drop a configuration file on disk" initial gate. Besides, there could totally be daemonless CNI plugins, especially in a magical IPv6 world. (Someday soon, I hope.) I don't think we want to restrict ourselves to needing a Kubernetes pod to signify readiness.
One more thought: another advantage of adding a STATUS verb is that it also works with chained plugins. |
My point is that I wasn't talking about CNI plugins in #628, I was talking about Kubernetes networking. At one point in the distant past when everyone used the stock kube-proxy and NetworkPolicy didn't exist, it was possible to set up Kubernetes networking using a plugin that communicated with the rest of Kubernetes solely via CNI. This is no longer the case, and never will be the case again in the future. So when we have a problem with Kubernetes network plugins that does not apply to CNI plugins such as the ones in containernetworking/plugins/main, then that suggests that CNI is not the right level to try to solve the problem at, because it's not a problem with CNI, it's a problem with Kubernetes networking, of which CNI is just one piece. (Though again, I agree that CNI totally needs a way to say "I am out of IP addresses".)
Ah. I see your point. Another way of framing this issue might be "what are the decisions I might make that are informed by network status?" Asking that question from the Kubernetes perspective (but remaining higher-level), I can think of a few:
It seems like we actually need two separate mechanisms to inform these decision points. So, perhaps, the spec wording of STATUS should be very explicit about only covering the latter case. (As an aside, NotReady on the k8s Node object is horribly overloaded. Is it just for scheduling? Should it inform eviction? It causes nodes to be removed from Cloud LBs, but doesn't affect EndpointSlices. The API we're working with here is just too coarse.) |
OK, there are a few pieces to this:
FYI, I've just thought of another potential use for Status: detecting that a node's configuration is untenable and reporting that back to the end-user, before any pods are scheduled. So, what if we were to return a status with multiple fields? Something like
Perhaps answers are optional...
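A purely hypothetical sketch of what such a multi-field status payload could look like, expressed as a Go struct (every field name is an assumption for illustration, not taken from the proposal or the spec); pointer fields are one way to make the individual answers optional:

```go
// Hypothetical multi-field STATUS result; all field names are illustrative.
package cnistatus

type StatusResult struct {
	// Pointer fields make each answer optional: nil means the plugin
	// declines to answer that particular question.
	CanServeAdd  *bool  `json:"canServeAdd,omitempty"`  // e.g. false when the IPAM pool is exhausted
	CanServeDel  *bool  `json:"canServeDel,omitempty"`  // DEL is expected to almost always succeed
	NodeConfigOK *bool  `json:"nodeConfigOK,omitempty"` // the node's configuration is tenable
	Message      string `json:"message,omitempty"`      // human-readable detail for the administrator
}
```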
Picking this back up, as I keep running into use cases for this. @danwinship as always, your analysis is spot on. There is a large scope for what "Status" could mean, and we need to leave whatever design open for evolution. I would really like to add "countable" resource support to CNI, but I think that's better tied in with a larger rethink in CNI 2.0. Unless we can come up with a simple expression that fits in the existing CNI 1.0 model, I'm loath to block any sort of "status" progress. At the end of the day, "Status" would be intended to be informational; it would not be a violation to see a "failed" status response and immediately call ADD. It would be silly, but allowed. The two status "flags" we care about are
Do we think it's useful to also add a third flag, given that DEL should "always" succeed?
/cc
There is a … that is part of the CRI API; you don't need to create Pods. Currently, the …
@squeed does this issue have any relation to CNI 2.0?
An update from the most recent meeting: plugins need a way to know if they can rely on STATUS. If not, they need to wait until they are ready before writing their CNI configuration file. This is the same issue as discussed in #927 -- so we will have the discussion there.
More discussion of kubelet-startup-network-readiness: kubernetes/kubernetes#120486
/assign |
Support for status has merged. |
wohoo |
@squeed what are the next steps? Implementing it in libcni? |
@aojea the next step is to cut a -rc and implement this in the plugins. Assuming there are no issues found, we can tag v1.1.0 and be done :) I hope to cut the rc next week. |
Sometimes plugins know that they can't accept any CNI requests.
Possible causes:
- They're a shim to a daemon, and that daemon is down.
- Their configuration file is somehow invalid or broken.
Right now, CRI (kubernetes)-flavored runtimes need to report network status, but they do so exclusively by checking for the existence of a CNI configuration file. It would be useful to provide more information to the end-user / administrator.
What if we added a "STATUS" verb? It would take a CNI configuration, but not be called in the context of a container.
The result should differentiate between permanent and temporary errors (which we already have some support for via error codes).
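As an illustration of how a runtime might drive such a verb using the existing exec conventions (a sketch under assumptions, not the merged spec or the libcni API): the plugin binary is invoked with CNI_COMMAND=STATUS and the network configuration on stdin, with no container-specific environment variables, and a non-zero exit with a structured error on stdout signals "not ready".

```go
// Sketch only: exec a CNI plugin with CNI_COMMAND=STATUS, passing the network
// configuration on stdin and no container context (no CNI_CONTAINERID,
// CNI_NETNS, or CNI_IFNAME). Paths and the plugin name are illustrative.
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

func pluginStatus(pluginBin string, netConf []byte) error {
	cmd := exec.Command(pluginBin)
	cmd.Env = append(os.Environ(),
		"CNI_COMMAND=STATUS",
		"CNI_PATH=/opt/cni/bin", // illustrative plugin search path
	)
	cmd.Stdin = bytes.NewReader(netConf)

	var out bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &out

	if err := cmd.Run(); err != nil {
		// On failure the plugin would emit a structured CNI error on stdout;
		// a runtime could inspect its error code to tell permanent from
		// temporary conditions.
		return fmt.Errorf("plugin reports not ready: %v (%s)", err, out.String())
	}
	return nil
}

func main() {
	conf := []byte(`{"cniVersion": "1.1.0", "name": "mynet", "type": "example-plugin"}`)
	if err := pluginStatus("/opt/cni/bin/example-plugin", conf); err != nil {
		fmt.Println("network not ready:", err)
		return
	}
	fmt.Println("network ready")
}
```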