When fire several blueprints at once VCD responds with "duplicate name (400) error" #42

bostko · 2016-10-14T13:14:18Z

Environment

vCloud Director 1.5
Latest jclouds 1.9.x
Standard admin account (I am not aware of the exact limit that triggers the OPERATION_LIMITS_EXCEEDED protection)

Steps to reproduce
Run Apache Brooklyn with jclouds-vcloud-director and deploy 10 blueprints simultaneously.

Observed behaviour
The debug logs show that when those 10 provisioning operations were triggered, roughly the last 5 of them failed on POST /vdc/{id}/action/composeVApp with OPERATION_LIMITS_EXCEEDED.

The log also shows that between sending the composeVApp request and receiving its response, 3 or 4 task status checks (GET /task/{taskId}) were made.
After a response to one of these GET task requests, composeVApp returns the following response:

<Error xmlns="http://www.vmware.com/vcloud/v1.5" minorErrorCode="OPERATION_LIMITS_EXCEEDED" message="[ {{requestId}} ] The maximum number of simultaneous operations for user &quot;{{User}}&quot; on organization &quot;{{Organization}}&quot; has been reached." majorErrorCode="400" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5
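A client that wants to treat this response differently from other 400 errors first has to recognise it. The following is a minimal sketch of that check, using only the JDK XML parser. The class and method names (`VcdErrors`, `minorErrorCode`, `isOperationLimitsExceeded`) are hypothetical and are not part of jclouds, and the sample body in `main` is an abbreviated, well-formed stand-in for the response above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class VcdErrors {

    /** Returns the minorErrorCode attribute of an Error body, or "" if it is absent. */
    static String minorErrorCode(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            // getAttribute returns an empty string when the attribute is missing.
            return root.getAttribute("minorErrorCode");
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parseable error body", e);
        }
    }

    /** True when the body reports the simultaneous-operations limit. */
    static boolean isOperationLimitsExceeded(String xml) {
        return "OPERATION_LIMITS_EXCEEDED".equals(minorErrorCode(xml));
    }

    public static void main(String[] args) {
        // Abbreviated sample; the real response carries more attributes.
        String body = "<Error xmlns=\"http://www.vmware.com/vcloud/v1.5\" "
                + "minorErrorCode=\"OPERATION_LIMITS_EXCEEDED\" "
                + "majorErrorCode=\"400\"/>";
        System.out.println(isOperationLimitsExceeded(body));
    }
}
```

Matching on `minorErrorCode` rather than on the HTTP status alone matters here, because the later duplicate-name failure is also reported as a 400.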

In the vCloud Director web console I can see that the vApps from the failed composeVApp requests are left over in a pending state.
They stay pending forever (neither created nor started), and I assume no resources or VMs are allocated for them.

Because of the retry logic for such responses implemented previously in #41, jclouds-vcloud-director issues a second composeVApp request, which then fails with a duplicate vApp name error.

Expected behavior
vCloud Director should not leave operations half done. Most business applications take care to fully abort unsuccessful operations.

Possible Workarounds that could be implemented in jclouds-vcloud-director

Reduce the number of task status polling requests.
Implement a locking mechanism that reduces or blocks requests from jclouds-vcloud-director before or after a composeVApp request is issued.
Exclude composeVApp from the retry logic in BaseHttpCommandExecutorService<Request> and implement custom retry logic that also cleans up orphaned vApps which may have been created with the initial vApp name.
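The third workaround could look roughly like the sketch below. It is not existing jclouds code: `VAppClient`, `LimitsExceededException`, and `composeWithRetry` are hypothetical names introduced for illustration, and the sketch assumes that vApp names are unique per deployment (so deleting by name only removes the orphan) and that a pending orphan can actually be deleted, which may not always hold.

```java
import java.util.Optional;

public class ComposeWithCleanup {

    /** Hypothetical abstraction over the vCloud calls; not a jclouds interface. */
    interface VAppClient {
        String compose(String name);              // returns the new vApp id
        Optional<String> findByName(String name); // id of a leftover vApp, if any
        void delete(String vAppId);
    }

    /** Hypothetical marker for an OPERATION_LIMITS_EXCEEDED response. */
    static class LimitsExceededException extends RuntimeException {
        LimitsExceededException(String message) {
            super(message);
        }
    }

    static String composeWithRetry(VAppClient client, String name,
                                   int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        LimitsExceededException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return client.compose(name);
            } catch (LimitsExceededException e) {
                last = e;
                // The rejected request may have left a pending vApp behind.
                // Remove it so the next attempt does not hit a duplicate name.
                client.findByName(name).ifPresent(client::delete);
                if (attempt < maxAttempts) {
                    sleep(backoff);
                    backoff *= 2; // back off further to reduce load on the server
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }
}
```

A caller would wrap the real compose, lookup, and delete operations in a `VAppClient` and invoke, for example, `composeWithRetry(client, "my-vapp", 5, 1000L)`. The cleanup calls count against the same operation limit, so they can be rejected too; a fuller version would need to handle that case.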


aledsage · 2016-10-14T13:39:10Z

It seems that the core problem is in VMware's vCloud Director implementation. We issue a POST /vdc/{id}/action/composeVApp and are rate-limited (getting back OPERATION_LIMITS_EXCEEDED), and yet VMware has partially executed the command! One would expect any sensible rate-limiting would either accept or reject the command, rather than partially executing it and then saying that the operation wasn't allowed!

Perhaps we should not think of VMware's response as rate-limiting. Instead, perhaps we should think of it as VMware saying "I am unwilling to finish executing your request at this time due to excessive activity. I may or may not have partially executed your request before deciding that you weren't allowed to do it; your system may be in an unexpected state (e.g. resources partially created, and/or stuck in a "pending" state); it is your responsibility to check what state the system is now in, and to do any rollback required (e.g. try to delete the partially created resources); you can then retry. However, I may reject these subsequent calls as well (potentially having partly executed them), if there is excessive activity at that time."

Clearly it is hard to program against an API with these semantics!

bostko changed the title ~~When fire several blueprints at once TAI2 respond with "duplicate name (400) error"~~ When fire several blueprints at once VCD respond with "duplicate name (400) error" Oct 14, 2016

bostko changed the title ~~When fire several blueprints at once VCD respond with "duplicate name (400) error"~~ When fire several blueprints at once VCD responds with "duplicate name (400) error" Oct 14, 2016
