When fire several blueprints at once VCD responds with "duplicate name (400) error" #42

Open
bostko opened this issue Oct 14, 2016 · 1 comment

@bostko
Contributor

bostko commented Oct 14, 2016

Environment

  • vCloud Director 1.5
  • Latest jclouds 1.9.x
  • Standard admin account (I do not know the exact limit at which the OPERATION_LIMITS_EXCEEDED protection kicks in)

Steps to reproduce
Using Apache Brooklyn with jclouds-vcloud-director, deploy 10 blueprints simultaneously.

Observed behaviour
The debug logs show that when those 10 provisioning operations were triggered, the last ~5 of them failed on POST /vdc/{id}/action/composeVApp with OPERATION_LIMITS_EXCEEDED.

The log also shows that between sending the composeVApp request and receiving its response, 3 or 4 task status checks (GET /task/{taskId}) were made.
After a response to GET task, the composeVApp response arrives and says:

<Error xmlns="http://www.vmware.com/vcloud/v1.5" minorErrorCode="OPERATION_LIMITS_EXCEEDED" message="[ {{requestId}} ] The maximum number of simultaneous operations for user &quot;{{User}}&quot; on organization &quot;{{Organization}}&quot; has been reached." majorErrorCode="400" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5
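
A minimal sketch of how a client could recognise this response, assuming the body is a well-formed vCloud Error element like the one above. The names VcdErrorInspector, minorErrorCode and isOperationLimitsExceeded are hypothetical and not part of jclouds; only the JDK's XML parser is used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

/** Hypothetical helper (not a jclouds class) that inspects a vCloud Error body. */
final class VcdErrorInspector {
    static final String LIMITS_EXCEEDED = "OPERATION_LIMITS_EXCEEDED";

    /** Returns the minorErrorCode attribute of the root element, or "" if it is absent. */
    static String minorErrorCode(String errorXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(errorXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return root.getAttribute("minorErrorCode");
        } catch (Exception e) {
            // Malformed or truncated XML ends up here.
            throw new IllegalArgumentException("Could not parse vCloud error body", e);
        }
    }

    /** True when the error body reports that the simultaneous-operations limit was hit. */
    static boolean isOperationLimitsExceeded(String errorXml) {
        return LIMITS_EXCEEDED.equals(minorErrorCode(errorXml));
    }
}
```

A caller would treat a true result as "the request may have been partially executed", not as a clean rejection.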

The vCloud Director web console shows that the failed composeVApp requests leave vApps behind in a pending state.
They stay pending forever (neither created nor started), and I assume no resources or VMs are allocated to them.

Because of the retry logic for such responses implemented previously in #41,
jclouds-vcloud-director issues a second composeVApp request, which then fails with a duplicate vApp name error.

Expected behavior
vCloud Director should not leave operations half done. Most business applications take care to fully abort unsuccessful operations.

Possible workarounds that could be implemented in jclouds-vcloud-director

  • Reduce the number of status polling requests.
  • Implement a locking mechanism that reduces or blocks other jclouds-vcloud-director requests just before and after composeVApp is issued.
  • Exclude composeVApp from the retry logic in BaseHttpCommandExecutorService<Request> and implement custom retry logic that also cleans up orphaned vApps that may have been created under the initial vApp name.
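
The locking idea could be sketched as a client-side cap on concurrent composeVApp calls. This is only an illustration: ComposeVAppThrottle is a hypothetical class, not existing jclouds code, and the right permit count is an assumption, since the actual vCloud Director limit is not known here.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical client-side throttle; not part of jclouds-vcloud-director. */
final class ComposeVAppThrottle {
    private final Semaphore permits;

    /** maxConcurrent is a guess that should stay below the (unknown) server-side limit. */
    ComposeVAppThrottle(int maxConcurrent) {
        // Fair semaphore, so waiting deployments are served in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /** Runs the given request while holding a permit, blocking until one is free. */
    <T> T call(Supplier<T> composeRequest) {
        permits.acquireUninterruptibly();
        try {
            return composeRequest.get();
        } finally {
            permits.release();
        }
    }
}
```

A caller would wrap the compose call, for example `throttle.call(() -> sendComposeVApp(name))`, where sendComposeVApp stands for whatever issues the POST. This only limits requests from a single JVM; other clients using the same account still count against the server-side limit.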
@aledsage
Member

It seems that the core problem is in VMware's vCloud Director implementation. We issue a POST /vdc/{id}/action/composeVApp and are rate-limited (getting back OPERATION_LIMITS_EXCEEDED), and yet VMware has partially executed the command! One would expect any sensible rate-limiting would either accept or reject the command, rather than partially executing it and then saying that the operation wasn't allowed!

Perhaps we should not think of VMware's response as rate-limiting. Instead perhaps we should think of it as VMware saying "I am unwilling to finish executing your request at this time due to excessive activity. I may or may not have partially executed your request before deciding that you weren't allowed to do it; your system may be in an unexpected state (e.g. resources partially created, and/or stuck in a "pending" state); it is your responsibility to check what state the system is now in, and to do any rollback required (e.g. try to delete the partially created resources); you can then retry. However, I may also reject these subsequent calls as well (potentially having partly executed them), if there is excessive activity at that time."

Clearly it is hard to program against an API with these semantics!
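
To make the "check state, roll back, then retry" reading concrete, here is a rough sketch. Everything in it is hypothetical: VAppOperations and OperationLimitsExceededException are stand-ins invented for illustration, not jclouds types. It also assumes vApp names are unique per deployment and that a pending vApp can actually be deleted, neither of which has been verified against vCloud Director.

```java
import java.util.Optional;

/** Hypothetical minimal view of the operations the sketch needs; not a jclouds API. */
interface VAppOperations {
    String compose(String vAppName);                 // returns the id of the new vApp
    Optional<String> findIdByName(String vAppName);  // looks up a leftover vApp, if any
    void delete(String vAppId);
}

/** Hypothetical stand-in for a 400 response carrying OPERATION_LIMITS_EXCEEDED. */
final class OperationLimitsExceededException extends RuntimeException {
    OperationLimitsExceededException(String message) { super(message); }
}

final class ComposeWithRollback {
    /** Composes a vApp, deleting any leftover with the same name before each retry. */
    static String composeWithCleanup(VAppOperations ops, String vAppName,
                                     int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        OperationLimitsExceededException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return ops.compose(vAppName);
            } catch (OperationLimitsExceededException e) {
                last = e;
                // The rejected request may have left a pending vApp under this name.
                // Note: the delete itself is not retried here and may also be rejected.
                ops.findIdByName(vAppName).ifPresent(ops::delete);
                if (attempt < maxAttempts) sleep(backoffMillis * attempt); // linear backoff
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", ie);
        }
    }
}
```

The weak point is the one described above: the lookup and delete calls count as operations too, so under sustained load they can hit the same limit.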

@bostko bostko changed the title When fire several blueprints at once TAI2 respond with "duplicate name (400) error" When fire several blueprints at once VCD respond with "duplicate name (400) error" Oct 14, 2016
@bostko bostko changed the title When fire several blueprints at once VCD respond with "duplicate name (400) error" When fire several blueprints at once VCD responds with "duplicate name (400) error" Oct 14, 2016