On RKE2 we have observed that the machine-provision pod can sometimes be stuck for hours due to the very large retry count of 4500. This mainly seems to happen in `retrieve_connection_info`, which, incidentally, does not `exit 1` even after it has exhausted all of its retries.
Regardless of what actually causes `retrieve_connection_info` to keep failing, wouldn't it make sense to use a more reasonable `RETRY_COUNT` here? Provisioning would then fail faster and retry by creating a whole new machine.
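
To illustrate the behavior being asked for, here is a minimal sketch of a bounded retry loop that returns a non-zero status once the retries run out. It is not the actual provisioning script: the default values for `RETRY_COUNT` and `RETRY_DELAY`, the `try_connection_info` helper, and the `CONNECTION_INFO_FILE` variable are all assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical sketch only; names and defaults below are assumptions,
# not taken from the real machine-provision script.
RETRY_COUNT="${RETRY_COUNT:-30}"   # assumed smaller default (the issue reports 4500)
RETRY_DELAY="${RETRY_DELAY:-2}"    # assumed delay in seconds between attempts

# Placeholder for the real check: succeeds once a (hypothetical) file exists.
try_connection_info() {
    [ -f "${CONNECTION_INFO_FILE:-/nonexistent}" ]
}

retrieve_connection_info() {
    attempt=1
    while [ "$attempt" -le "$RETRY_COUNT" ]; do
        if try_connection_info; then
            return 0
        fi
        # Wait only if another attempt will follow.
        if [ "$attempt" -lt "$RETRY_COUNT" ]; then
            sleep "$RETRY_DELAY"
        fi
        attempt=$((attempt + 1))
    done
    # All retries exhausted: report the failure and signal it to the caller.
    echo "connection info not available after $RETRY_COUNT attempts" >&2
    return 1
}
```

A caller could then run `retrieve_connection_info || exit 1`, so that the pod fails promptly and a new machine can be created instead of waiting out thousands of attempts.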