Resolve delete event ignored or duplicated issue for two-way syncing. #19

Open
stefan-cardnell-rh opened this issue Sep 30, 2024 · 0 comments

kafka_skip has a couple of issues when it comes to delete events. Consider the following scenario:

  1. System A creates an instance of Model A and emits an event to topicA.
  2. System B consumes the event and creates the same instance, setting kafka_skip=True so that producers listening to changes on Model A (the Debezium source or the Python producer) don't re-send the event to topicA.
  3. The instance of Model A is deleted in System B.
  4. Because kafka_skip=True was set on the instance in step 2, the producers filter out the deletion event, so it never reaches topicA even though it should.
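
The steps above can be reproduced with a minimal sketch. The `Instance` class and the `should_emit` filter are hypothetical stand-ins for the model row and the producer-side filter; they are not the project's actual code, and the real filter may live in the Debezium config rather than in Python.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for a Model A row carrying the kafka_skip flag."""
    pk: int
    kafka_skip: bool = False


def should_emit(instance: Instance, operation: str) -> bool:
    """Assumed producer filter: drop every change whose row has kafka_skip set.

    The operation is ignored on purpose, which is what causes the problem.
    """
    return not instance.kafka_skip


# Step 2: System B consumes the create event and stores its copy with kafka_skip=True.
copy_in_b = Instance(pk=1, kafka_skip=True)

# The create is correctly suppressed, so it is not echoed back to topicA.
create_emitted = should_emit(copy_in_b, "create")

# Step 3: System B deletes the instance. The flag is still True on the row,
# so the delete is suppressed too, although System A needs to hear about it.
delete_emitted = should_emit(copy_in_b, "delete")
```

Both `create_emitted` and `delete_emitted` end up `False` here: the filter cannot tell a consumed create from a locally initiated delete.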

If the producers simply stop filtering on kafka_skip for deletes (either in the Debezium config or in the Python producer code), the result is duplicated messages in the following scenario:

  1. System B creates instance of Model A and emits event to topicA.
  2. System A consumes event and creates the same instance.
  3. System B then deletes the instance, emitting a deletion event to topicA.
  4. System A consumes the event and deletes its instance.
  5. Since deletes are not filtered, System A's deletion fires another delete event to topicA.
  6. Since the instances are already deleted, there should be no further errors. The only problem is that the deletion event is duplicated for each additional system listening to the topic.
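
The duplication can be counted with a small in-memory simulation. This is only a sketch under stated assumptions: `replay_delete` is a hypothetical helper, the topic is modelled as a plain list, every system starts with a copy of the instance, and no system filters delete events.

```python
def replay_delete(systems, origin, pk=1):
    """Return the delete events that end up on the topic when deletes are never filtered."""
    # Every system starts with its own copy of the instance.
    stores = {name: {pk} for name in systems}
    topic = []

    # The originating system deletes the instance and emits a delete event.
    stores[origin].discard(pk)
    topic.append((origin, pk))

    # Process the topic in order; events appended while looping are consumed too.
    offset = 0
    while offset < len(topic):
        sender, key = topic[offset]
        offset += 1
        for name in systems:
            if name == sender:
                continue
            if key in stores[name]:
                # The consumer deletes its copy and, with no filter in place,
                # its own producer emits yet another delete event.
                stores[name].discard(key)
                topic.append((name, key))
            # If the copy is already gone, nothing is deleted and nothing is emitted.
    return topic


two_systems = replay_delete(["A", "B"], origin="B")
three_systems = replay_delete(["A", "B", "C"], origin="B")
```

With two systems the topic holds two delete events for a single logical deletion, and with three systems it holds three, which matches one duplicate per additional listening system.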

kafka_skip was intended as a marker that prevents events from being re-sent when consuming create/update operations. However, this marker is not set during deletes, and it is unclear whether setting it is even possible.

Duplicated messages aren't great, since they make debugging the message flow harder.
