Race condition causing celerybeat and celeryworker to fail #2727
Thanks for the great explanations in this bug report. It would be nice to fix it, yes. You're saying you have some code to fix it; I'd be interested in knowing more about your solution.
To ensure that all the migrations for `django_celery_beat` have been applied before the celery services start, I am currently using the following in their entrypoint script:

```bash
django_celery_migrations_complete() {
  count=$(python manage.py showmigrations django_celery_beat | grep '\[ \]' | wc -l)
  if [[ $count -eq 0 ]]; then
    return 0
  fi
  return 1
}

until django_celery_migrations_complete; do
  >&2 echo 'Waiting for Django Celery migrations to complete...'
  sleep 5
done

>&2 echo 'Django Celery migrations have been applied'
```

The only thing I cannot avoid is a little code duplication. Since this requires database access, the excerpt above must be preceded by the same postgres health check present in the django service's entrypoint script. I am not sure how to keep things DRY, so I implemented it such that I have a separate entrypoint script for the django service, and another entrypoint script for the celery services, like so:
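Roughly, the celery entrypoint ends up looking like this (simplified sketch; the file layout is illustrative, and the `postgres_ready` helper is the duplicated check borrowed from the django entrypoint):

```bash
#!/bin/bash
# Illustrative celery-services entrypoint (paths and names are assumptions).
set -o errexit
set -o nounset

# Duplicated from the django entrypoint: wait until postgres accepts connections.
postgres_ready() {
python << END
import sys

import psycopg2

try:
    psycopg2.connect(
        dbname="${POSTGRES_DB}",
        user="${POSTGRES_USER}",
        password="${POSTGRES_PASSWORD}",
        host="${POSTGRES_HOST}",
        port="${POSTGRES_PORT}",
    )
except psycopg2.OperationalError:
    sys.exit(-1)
sys.exit(0)
END
}

until postgres_ready; do
  >&2 echo 'Waiting for PostgreSQL to become available...'
  sleep 1
done

# django_celery_migrations_complete() is the function from the excerpt above.
until django_celery_migrations_complete; do
  >&2 echo 'Waiting for Django Celery migrations to complete...'
  sleep 5
done

exec "$@"
```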
Just to clarify one small thing, are you having this issue locally or in production? As a reminder, when running locally, migrations are applied automatically when the django container starts.
I was initially thinking to make the celery services depend on django in the docker-compose file, though I'm not sure of the drawbacks of such a solution. Another idea I've read about was to declare another service just for migrations and make all the others depend on it.

One other drawback I foresee with the proposed solution is that it depends on the output of the presentational `showmigrations` command. I've had some issues in the past when trying to parse that output; it was fine for my toy project, but not something I'm too keen to add to a popular open source project.

These are just my thoughts from a quick look, let me know if I've missed anything. Other ideas are welcome of course.
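For reference, the "dedicated migrations service" idea would look roughly like this in the compose file (untested sketch; service and build settings are illustrative):

```yaml
# Untested sketch of the "dedicated migrations service" idea.
services:
  migrations:
    build: .  # assumption: built from the same image as the django service
    command: python manage.py migrate --noinput
    depends_on:
      - postgres

  celeryworker:
    depends_on:
      - migrations
    # caveat: classic depends_on only orders container startup,
    # it does not wait for the migrate command to actually finish
```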
I am running this locally, and I am aware that migrations will be run automatically on the django container for local development.
I have tried using `depends_on` as well. In the screenshots, the celery services still fail on the first run, since `depends_on` only waits for the django container to start, not for the migrations to finish.

The project already pins a Django version, so in my opinion, the fragility of parsing `showmigrations` output becomes less of an issue. An alternative is to issue raw SQL queries to verify that all the migrations for `django_celery_beat` have been applied, as sketched below.

It is also an alternative to just not fix this race condition at all: document it as such, and tell users to first build the stack, then run the migrations before starting the celery services.
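Something along these lines (sketch only; `DATABASE_URL` and the expected count are assumptions):

```bash
# Sketch of the raw SQL idea: query the django_migrations table directly
# instead of parsing the presentational showmigrations output.
applied=$(psql "${DATABASE_URL}" --tuples-only --no-align --command \
  "SELECT count(*) FROM django_migrations WHERE app = 'django_celery_beat';")

# EXPECTED is a placeholder; it would have to track the number of migrations
# shipped by the pinned django-celery-beat release.
EXPECTED=12
if [[ "${applied}" -ge "${EXPECTED}" ]]; then
  >&2 echo 'django_celery_beat migrations have been applied'
fi
```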
@malefice That's pretty much the reason for the Travis implementation for Docker. This initial migration issue that I wanted to fix (and obviously failed to) could be avoided if celerybeat had its own entrypoint file that we can specify in the compose file.
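i.e. roughly (sketch; the script path is illustrative):

```yaml
celerybeat:
  entrypoint: /entrypoint-celerybeat.sh  # dedicated entrypoint containing the migration wait
```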
If using docker-compose 1.29 or later, would a condition like the following work?
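For example, something like this (sketch; the healthcheck command is illustrative):

```yaml
services:
  django:
    healthcheck:
      # healthy once no django_celery_beat migrations are left unapplied
      test: ["CMD-SHELL", "python manage.py showmigrations django_celery_beat | grep -q '\\[ \\]' && exit 1 || exit 0"]
      interval: 10s
      timeout: 5s
      retries: 30

  celerybeat:
    depends_on:
      django:
        condition: service_healthy
```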
Also, we can always run the migrations before starting the server.
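A minimal sketch of that approach (note that running `migrate` from several containers at once can itself race, so ideally only one service does it):

```bash
#!/bin/bash
# Sketch: apply migrations, then hand off to the service command.
set -o errexit
set -o nounset

python manage.py migrate --noinput
exec "$@"
```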
Someone tried to fix that issue in #4459, and after taking a further look at this, I'm not convinced it's a bug worth fixing. The error will happen once, the first time you start your server, and will pretty much go away forever. The error being returned is quite clear, and fixing it properly would involve hiding the problem. I'm marking it as won't fix, unless someone comes up with a simple and foolproof solution...
As discussed, we won't be implementing this. Automatically closing.
What happened?
There is a race condition where the `celerybeat` and/or `celeryworker` services will fail to start if the `django-celery-beat` migrations have not run. Bringing the stack up again once those migrations have run will resume normally as expected.

What should've happened instead?
While this can only occur under specific conditions (usually the first run), it is best to have a guarantee that all services will be up and running 100% out-of-the-box. The `celerybeat` and/or `celeryworker` services should have a health check in their entrypoint script prior to running their command script. I personally have some code for this if you guys are interested, but I want to hear your thoughts first.

Steps to reproduce
Tested on Ubuntu 18.04.4 LTS, Docker Engine version 19.03.12, docker-compose version 1.17.1
Some screenshots