Is django-tree multiprocessing safe ? #5

Open
stuaxo opened this issue Nov 14, 2019 · 3 comments
stuaxo commented Nov 14, 2019

Using django-treebeard with Celery has been a bit of a nightmare: it's multiprocessing unsafe, and this caused a bit of a headache.

I'm guessing django-tree is MP safe, but the docs don't mention this. If it is, then I would mention it as it's a massive selling point over treebeard.

Digression / rant: my own project uses treebeard (via djangocms). Only being able to update one thing at a time (anything that uses TB underneath) is a severe limitation and really hamstrings my update code, but I'm stuck with it. If this is MP safe, you might even attract more devs by mentioning it.

Author

stuaxo commented Nov 14, 2019

AFAICT the code that fails in TB does so because generating a node id and creating the node are separate steps. With two processes, both can get the same id; then, when it comes to writing, only one succeeds.
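The race described above can be reproduced deterministically in a few lines. This is a minimal sketch, not treebeard's actual code: the `node` table, the `next_path` and `create_node` helpers, and the four-digit path format are all assumptions made for illustration, and SQLite stands in for the real database. The two "processes" are simulated by interleaving the calls by hand.

```python
import sqlite3

# Hypothetical schema: each node has a unique path (stand-in for a tree id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL)"
)
conn.execute("INSERT INTO node (path) VALUES ('0001')")


def next_path(conn):
    """Step 1: compute the next free path from the current table contents."""
    (last,) = conn.execute("SELECT MAX(path) FROM node").fetchone()
    return f"{int(last) + 1:04d}"


def create_node(conn, path):
    """Step 2: write the node. Returns True on success, False on conflict."""
    try:
        conn.execute("INSERT INTO node (path) VALUES (?)", (path,))
        return True
    except sqlite3.IntegrityError:
        return False


# Simulated interleaving: both workers run step 1 before either runs step 2.
path_a = next_path(conn)
path_b = next_path(conn)
result_a = create_node(conn, path_a)
result_b = create_node(conn, path_b)

print(path_a, path_b, result_a, result_b)
```

Both workers compute the same path because neither has written yet, so the unique constraint rejects the second insert. Computing and writing the value in a single atomic database operation removes that window.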

@jacobjove

Hi @BertrandBordage , any word on this?

@BertrandBordage
Owner

Django-tree should be as multiprocessing safe as its PostgreSQL triggers are.
This means that a concurrent write may have to wait for a trigger to finish, leading to triggers queueing up. This can make a big import much slower than it could be. So it's multiprocessing safe, but slower than without the tree structure.

That being said, what my team & I usually do when importing large amounts of data (concurrently or not) is disable the trigger using the context manager, then force rebuild all paths. Here is a recent example, which made the script go from taking 2 hours to 15 minutes: https://github.com/dezede/dezede/blob/master/dezede/management/commands/import_melodies.py#L745-L763
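The general shape of that pattern (switch off the per-row hook, bulk insert, then rebuild every path once) can be sketched in plain Python. This is a toy in-memory model, not django-tree's API: the `Tree` class and its `disabled_trigger` and `rebuild_paths` methods are hypothetical names chosen for illustration, and no database is involved. See the linked script for the real usage.

```python
from contextlib import contextmanager


class Tree:
    """Toy tree; `paths` plays the role of a trigger-maintained column."""

    def __init__(self):
        self.parents = {}  # node id -> parent id (None for roots)
        self.paths = {}    # node id -> tuple of ids from root to node
        self.trigger_enabled = True

    def _compute_path(self, node_id):
        """Walk up the parents to build the path for one node."""
        path = []
        current = node_id
        while current is not None:
            path.append(current)
            current = self.parents[current]
        return tuple(reversed(path))

    def insert(self, node_id, parent_id=None):
        self.parents[node_id] = parent_id
        if self.trigger_enabled:
            # Per-row work, analogous to a trigger firing on each write.
            self.paths[node_id] = self._compute_path(node_id)

    @contextmanager
    def disabled_trigger(self):
        """Temporarily skip the per-row path update."""
        previous = self.trigger_enabled
        self.trigger_enabled = False
        try:
            yield
        finally:
            self.trigger_enabled = previous

    def rebuild_paths(self):
        """Recompute every path in one pass after a bulk import."""
        self.paths = {n: self._compute_path(n) for n in self.parents}


tree = Tree()
with tree.disabled_trigger():
    tree.insert(1)
    tree.insert(2, parent_id=1)
    tree.insert(3, parent_id=2)

paths_before_rebuild = dict(tree.paths)  # still empty: no hook ran
tree.rebuild_paths()
print(tree.paths)
```

The `try`/`finally` in the context manager restores the previous state even if the import raises, so the hook is not left disabled by accident. The paths are stale until the rebuild runs, so nothing should read them in between.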

I'm keeping this issue open since it's true that I did not write unit tests to ensure it works as intended.
In any case, after years of using django-tree in multiprocessing production environments, I have not seen any data issue yet and never had to rebuild the tree structures, apart from the use case listed above.

3 participants