Competition on the service object caused by naming conflicts #1465

AnSmith22 · 2023-10-04T21:12:26Z

AnSmith22
Oct 4, 2023

Describe the bug

Given a rabbitmq cluster instance named foo, the rabbitmq operator creates a headless service named foo-nodes and a client (ClusterIP) service named foo. This naming mechanism can cause competition between multiple rabbitmq cluster instances.

Concretely, suppose there is another rabbitmq cluster instance named foo-nodes. Its client service will also be named foo-nodes, and the competition starts: when reconciling foo, the controller will create/update foo-nodes as a headless service; when reconciling foo-nodes, the controller will create/update foo-nodes as a client service. In fact, only one rabbitmq cluster instance will actually win, because the service object can have only one controller owner reference (pointing to only one rabbitmq cluster instance), and the other one is destined to never get its service object ready.
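To make the collision concrete, here is a minimal Python sketch of the naming scheme as described in this report. The helper names (`owned_service_names`, `find_conflicts`) are made up for illustration and are not the operator's actual code; the sketch only models the names, not the reconcile loop.

```python
# Illustrative model of the naming scheme described in this report.
# All function names here are hypothetical, not the operator's real API.

def owned_service_names(cluster: str) -> dict:
    """Service names the operator derives for one cluster instance."""
    return {
        "headless": f"{cluster}-nodes",  # headless service: <name>-nodes
        "client": cluster,               # client service: <name>
    }

def find_conflicts(clusters: list) -> dict:
    """Map each service name claimed by more than one cluster to its claimants."""
    claims = {}
    for cluster in clusters:
        for kind, name in owned_service_names(cluster).items():
            claims.setdefault(name, []).append((cluster, kind))
    return {name: owners for name, owners in claims.items() if len(owners) > 1}

print(find_conflicts(["foo", "foo-nodes"]))
# {'foo-nodes': [('foo', 'headless'), ('foo-nodes', 'client')]}
```

Both instances claim the same Service object, but with different specs, which is why the two reconcile loops keep competing over it.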

To Reproduce

Steps to reproduce the behavior:

1. Create a rabbitmq cluster instance called foo
2. Create a second one called foo-nodes

One can switch the order of the above steps and will observe symmetric behavior.

Expected behavior
The rabbitmq operator should never cause any naming conflicts for any rabbitmq cluster instances it manages.

Version and environment information

RabbitMQ Cluster Operator: 008f7c7

Additional context

A potential solution is to change how the operator names the client service. For example, if the client service were named foo-client, such naming conflicts could never happen (between rabbitmq cluster instances).
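A quick way to sanity-check this idea is to model the proposed scheme and compare the derived names for a few awkward cluster names. This is only a sketch under the assumption that every service gets a distinct fixed suffix (`-nodes` or `-client`); `proposed_service_names` and `has_conflict` are hypothetical helpers.

```python
from itertools import combinations

# Hypothetical model of the proposed scheme: every service name carries
# a distinct suffix, so no service is named exactly like its cluster.

def proposed_service_names(cluster: str) -> set:
    return {f"{cluster}-nodes", f"{cluster}-client"}

def has_conflict(a: str, b: str) -> bool:
    """True if two cluster instances would derive the same service name."""
    return bool(proposed_service_names(a) & proposed_service_names(b))

# Names chosen to look as collision-prone as possible.
sample = ["foo", "foo-nodes", "foo-client", "foo-nodes-nodes", "foo-client-nodes"]
conflicting_pairs = [(a, b) for a, b in combinations(sample, 2) if has_conflict(a, b)]
print(conflicting_pairs)  # []
```

The reason it works in general: a name ending in `-nodes` can never equal a name ending in `-client`, and two names with the same suffix are equal only if the cluster names are equal.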

mkuratczyk · 2023-10-05T08:38:50Z

mkuratczyk
Oct 5, 2023
Maintainer

Is this a hypothetical problem or did someone actually create a cluster with a -nodes suffix? :) It feels unlikely, and within a namespace you should coordinate to ensure uniqueness anyway, just like you can't have two clusters called rmq, for example. I guess we could add a simple name check to reject cluster names with a -nodes suffix altogether (or check whether another cluster with an overlapping name is already present, but that's a bit more involved). Can you think of any other name conflicts?
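For illustration, the two checks mentioned above could look roughly like this in Python. This is a sketch, not a proposed implementation: `has_reserved_suffix` and `overlapping_clusters` are hypothetical names, and `existing` stands in for the list of cluster names already present in the namespace.

```python
NODES_SUFFIX = "-nodes"  # suffix used for the headless service

def has_reserved_suffix(name: str) -> bool:
    """Simple check: reject any cluster name ending in -nodes."""
    return name.endswith(NODES_SUFFIX)

def overlapping_clusters(name: str, existing: list) -> list:
    """More involved check: existing clusters whose services would clash with `name`.

    A clash happens when one cluster's name equals the other's headless
    service name, in either direction.
    """
    return [
        other
        for other in existing
        if name == other + NODES_SUFFIX or other == name + NODES_SUFFIX
    ]

print(has_reserved_suffix("foo-nodes"))                  # True
print(overlapping_clusters("foo-nodes", ["foo", "bar"]))  # ['foo']
print(overlapping_clusters("foo", ["foo-nodes"]))         # ['foo-nodes']
print(overlapping_clusters("bar-nodes", ["foo"]))         # []
```

The last line shows the trade-off: the suffix check would reject `bar-nodes` even though nothing overlaps with it, while the overlap check needs to see the other clusters in the namespace.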

Changing how we name services would be a breaking change for thousands of users, so I'm against that, especially for a (hypothetical?) corner case.

Zerpet · 2023-10-05T09:32:52Z

Zerpet
Oct 5, 2023
Maintainer

Yes, this is a hypothetical scenario that can happen. You can create your own admission controllers using Gatekeeper and write your own policies to prevent such a scenario. The cluster operator does not have persistent state and, as such, must create names in a predictable manner, so that it can "find" objects it owns after a Pod restart.
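A rough Python sketch of why predictable names matter for a stateless controller: nothing is remembered between runs, so the only way to locate a service is to recompute its name from the cluster name. Everything here is an assumption for illustration; `api_server` is a plain dict standing in for the Kubernetes API, and `derived_names` / `find_services` are hypothetical helpers.

```python
def derived_names(cluster: str) -> dict:
    """Names are a pure function of the cluster name (scheme from this thread)."""
    return {"client": cluster, "headless": f"{cluster}-nodes"}

def find_services(cluster: str, api_server: dict) -> dict:
    """Look up a cluster's services by recomputing their names."""
    return {kind: api_server.get(name) for kind, name in derived_names(cluster).items()}

# First run: create the services for cluster "foo".
api_server = {}  # stand-in for the Kubernetes API, the only state that survives
for kind, name in derived_names("foo").items():
    api_server[name] = {"kind": kind, "owner": "foo"}

# After a restart the process has no memory, yet the lookup still works
# because the names can be derived again from the cluster name alone.
found = find_services("foo", api_server)
print(found["headless"])  # {'kind': 'headless', 'owner': 'foo'}
```

If the names contained anything random or remembered only in process memory, the second lookup would have nothing to key on.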

AnSmith22 · 2023-10-05T16:35:45Z

AnSmith22
Oct 5, 2023
Author

Hi @mkuratczyk @Zerpet thanks for your reply!

As you said, it is indeed a corner case, one that happens when users in the same namespace don't coordinate well with each other. And I would say that the low occurrence of the bug does not reduce its severity once it happens :)

I totally understand that changing the service name would be a breaking change, and I think it would be very helpful to have some name checks to prevent such corner cases from the beginning.

> Can you think of any other name conflicts?

I will go through the other objects and see if there are other similar potential conflicts.

1 reply

Zerpet Oct 16, 2023
Maintainer

> I think it would be very helpful to have some name checks to prevent such corner cases from the beginning.

...by writing your own admission policies with Gatekeeper :-)
