Celery: DatabaseScheduler may not work in celery 4.1.0

Created on 7 Aug 2017  ·  28 Comments  ·  Source: celery/celery

I installed celery 4.1.0 with django_celery_beat 1.0.1, and the DatabaseScheduler does not seem to work properly.

[2017-08-07 21:12:10,790: DEBUG/MainProcess] DatabaseScheduler: Fetching database schedule
[2017-08-07 21:12:10,797: DEBUG/MainProcess] Current schedule:
[2017-08-07 21:12:10,807: DEBUG/MainProcess] beat: Ticking with max interval->5.00 seconds
[2017-08-07 21:12:10,809: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:15,813: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-08-07 21:12:15,813: INFO/MainProcess] Writing entries...
[2017-08-07 21:12:15,816: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:20,818: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:25,825: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:30,831: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:35,839: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:40,844: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:45,851: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:50,854: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:12:55,860: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:13:00,862: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:13:05,870: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
^C[2017-08-07 21:13:10,245: INFO/MainProcess] Writing entries...
[2017-08-07 21:13:10,246: INFO/MainProcess] Writing entries...

As you can see, the scheduler was supposed to send the task every minute, but it never did. (I've set the crontab to all *, so it shouldn't be a timezone problem.)

But in celery 4.0.2, everything works fine! I don't know whether it is a bug or not. Maybe django_celery_beat isn't compatible with 4.1.0.

[2017-08-07 21:18:43,339: DEBUG/MainProcess] Current schedule:
[2017-08-07 21:18:43,351: DEBUG/MainProcess] beat: Ticking with max interval->5.00 seconds
[2017-08-07 21:18:43,364: INFO/MainProcess] Scheduler: Sending due task schedule (GeneBank.tasks.test)
[2017-08-07 21:18:43,376: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-08-07 21:18:43,376: INFO/MainProcess] Writing entries...
[2017-08-07 21:18:43,380: DEBUG/MainProcess] GeneBank.tasks.test sent. id->9c1bdf10-0a5f-440a-98db-9eb24433a8d4
[2017-08-07 21:18:43,381: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:18:48,386: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:18:53,392: DEBUG/MainProcess] beat: Waking up in 5.00 seconds.
[2017-08-07 21:18:58,397: DEBUG/MainProcess] beat: Waking up in 1.59 seconds.
[2017-08-07 21:19:00,001: INFO/MainProcess] Scheduler: Sending due task schedule (GeneBank.tasks.test)

Celerybeat

All 28 comments

I'm using both and they are working just fine

After updating from version 4.0.2 to 4.1.0 I got a similar error: the task scheduler does not work properly.

same here, when I downgrade to 4.0.2 it works again.

I think this bug is related to the timezone: when I change the timezone to UTC, it works.

CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True

Could you please re-check with current master branch?

I can confirm this bug in 4.1.0

in my settings:

CELERY_TIMEZONE = 'Europe/Moscow'

And yes it works fine with:

CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True

Same issue: we have multiple projects using celery beat, but one of them happened to set CELERY_TIMEZONE to the project timezone, which was America/New_York. I literally woke up to 18 million messages in the QA RabbitMQ server because the workers couldn't keep up with the rate at which they were enqueued (hundreds a minute). Deleting the setting and letting the project default to a CELERY_TIMEZONE of None also resolved the issue.

FWIW -- I don't think we are using the DatabaseScheduler. Maybe the issue name should be renamed?

@matteius @AyumuKasuga if you can run a test with the master branch to verify the fix, it would be great. Sorry for the issues.

Hello @georgepsarakis !
I have just tested the master branch and unfortunately the issue still exists in my setup.

Same here!

Hi,

I have a similar problem. I followed the django-celery-beat setup instructions from: http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html

CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True
CELERY_RESULT_BACKEND = 'django-db'
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'

Then I define a periodic task:

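# Imports assumed; they were not shown in the original post.
from datetime import timedelta

from celery.task import periodic_task


@periodic_task(run_every=timedelta(seconds=30))
def do_stuff():
    print("HI")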

When starting celery beat, I get the following output:

$> DJANGO_SETTINGS_MODULE="proj.settings.dev" celery -A proj beat -l info
celery beat v4.1.0 (latentcall) is starting.
__    -    ... __   -        _
LocalTime -> 2017-08-28 15:58:44
Configuration ->
    . broker -> amqp://guest:**@localhost:5672//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> django_celery_beat.schedulers.DatabaseScheduler

    . logfile -> [stderr]@%INFO
    . maxinterval -> 5.00 seconds (5s)
[2017-08-28 15:58:44,425: INFO/MainProcess] Writing entries...
[2017-08-28 15:58:45,629: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2017-08-28 15:58:45,630: INFO/MainProcess] Writing entries...

The periodic task exists in the database and is marked as enabled.

However, a celery worker does not receive or execute any periodic tasks:

$> DJANGO_SETTINGS_MODULE="proj.settings.dev" celery -A proj worker -l info -E

 -------------- celery@proj-dev v4.0.2 (latentcall)
---- **** ----- 
--- * ***  * -- Linux-4.4.0-83-generic-x86_64-with-Ubuntu-16.04-xenial 2017-08-28 15:57:42
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> app:         proj:0x7f89f78faeb8
- ** ---------- .> transport:   amqp://guest:**@localhost:5672//
- ** ---------- .> results:     
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: ON
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery


[tasks]
  . proj.tasks.do_nothgin
  . proj.tasks.do_stuff

[2017-08-28 15:57:42,269: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2017-08-28 15:57:42,287: INFO/MainProcess] mingle: searching for neighbors
[2017-08-28 15:57:43,324: INFO/MainProcess] mingle: all alone

Software:

celery==4.1.0
django-celery-beat==1.0.1
django-celery-results==1.0.1
Django==1.8.2

Any ideas/help to resolving this issue is welcomed.

Best,
Sebastian

A piece of shit. It doesn't work.

@mchen-scala I think OSS projects are for collaboration and for objective, constructive criticism. How many lines of code have you put into the piece of shit? I actually have Celery with beat working perfectly.

I have built far more advanced systems than you will ever have. I have written operating systems and built high-frequency trading platforms and large-scale cross-datacenter DBs. And many more.

Lines of code? I only look at systems that WORK. LOC is for tyros.

I have to use Celery because the system I inherited uses it. I am going to get rid of it and write my own once we are done with our first delivery.

In addition to the beat problem, asking people to use a lock to ensure at-most-once property in task execution is a GIGANTIC JOKE.

Stick to what you are good at, which is OSS since you can't get high-paying jobs.

@mchen-scala You have never written a complex system in your life, you cannot even understand the context of this ticket, which is that in the Celery 4 series there are some versions where the CELERY_BEAT_SCHEDULE is not followed if the timezone is not UTC.

Your points are all inaccurate and insulting on many levels. The joke is that you come here to insult us for working on an OSS project that suits the needs of many people across many industries. I am going to assume anything you have ever written for an employer hasn't seen as much light of day as the Celery project, as you actually haven't referenced any of your prior works. In an agile world you would be releasing every 2-4 weeks, so the fact that you are working on a project that you inherited instead of these awesome systems you supposedly built only indicates to me that you have an inflated sense of self worth.

Furthermore mchen-scala, I do encourage you to switch off celery--primarily because we don't need your attitudes in our community. I have a high paying job because I am able to leverage OSS and provide support to fixing issues as they arise. I suggest you follow your own mantra and stick to what you are good at, which apparently is rolling your own solutions to problems with existing solutions, and being an antisocial jerk to the rest of us. Cya!

I have pinpointed the exact line of code that converts my already timezone-aware datetime to a future datetime with an offset of +20, which I am sure isn't a real timezone offset.
2017-10-11 22:42:27.041931-04:00 gets converted to 2017-10-12 22:42:27.041931+20:00 at the line in my Pull Request.
Apparently in UTC mode the datetime object stays the same at this point in the code. So what happens next is that remaining_delta comes out as -1 day, 1:27:32.958069, i.e. behind the task schedule. So beat sends out the task and doesn't sleep long because it's always behind. It just keeps beating out the tasks, always, because it's -1 days until the task is due.
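
Reading the two timestamps quoted above with only the standard library suggests that they denote the same instant, and that only the wall-clock fields moved forward by a day. The following is a minimal sketch, not Celery's code: the fixed-offset zones are an assumption standing in for whatever tzinfo objects Celery held at that point.

```python
from datetime import datetime, timedelta, timezone

# The two values quoted in the comment, rebuilt with fixed-offset zones
# (an assumption; the real objects were most likely pytz timezones).
before = datetime(2017, 10, 11, 22, 42, 27, 41931,
                  tzinfo=timezone(timedelta(hours=-4)))
after = datetime(2017, 10, 12, 22, 42, 27, 41931,
                 tzinfo=timezone(timedelta(hours=20)))

# Aware datetimes compare by the instant they represent, not by wall clock.
same_instant = before == after

# Dropping the offsets compares the wall-clock fields only.
wall_clock_gap = after.replace(tzinfo=None) - before.replace(tzinfo=None)

print(same_instant)
print(wall_clock_gap)
```

If later code reads the wall-clock fields without honouring the offset, it would see a time one day ahead, which may be related to the roughly one-day-negative remaining delta described above.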

Granted, my PR comments out a really old line of code, and yet all the unit tests seemed to pass and it did solve this issue in my testing. Looking for collaborator feedback.

I have shown this to be an issue on Python 2.7 as well as Python 3.5 and 3.6 versions. Not to say it isn't a bug in other versions, those are just ones I set up environments with.
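
The runaway behaviour described in this comment (a remaining delta that is always negative, so beat never sleeps) can be modelled with a few lines. This is a hypothetical, simplified due check written for illustration only; the function name and return shape are assumptions and it is not Celery's implementation.

```python
from datetime import timedelta


def is_due(remaining):
    """Hypothetical due check: return (due, seconds_to_sleep).

    A task is treated as due when no time remains; a negative
    remaining delta is clamped to zero, so it is always due.
    """
    seconds = max(remaining.total_seconds(), 0)
    return seconds == 0, seconds


# Healthy case: 40 seconds left until the next run, so beat can sleep.
healthy = is_due(timedelta(seconds=40))

# The value quoted in the comment: -1 day, 1:27:32.958069.
broken = is_due(timedelta(days=-1, hours=1, minutes=27,
                          seconds=32, microseconds=958069))

print(healthy)
print(broken)
```

Under this model, as long as the remaining delta stays negative on every tick, each tick reports the task as due with nothing to wait for, which matches the flood of messages reported in this thread.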

I tried to migrate from not using the database for beat to using django-celery-beat... The day has mostly been spent reading GitHub issues :(

Is there anyone else not getting this to work when using CELERY_TIMEZONE = 'UTC'? I'm having problems getting it to work with that set as well..

@xeor You may also need to be setting CELERY_ENABLE_UTC = True

The actual problem with the datetime being passed into localize() is that if you have a local, non-UTC timezone set for your project, then that datetime is already correct going into localize(), and dt = dt.astimezone(tz) then transforms it into a nonsensical future datetime with a timezone that makes no sense.

@xeor I was having the same issue even with the following settings:

CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'

Now, I know it seems silly, but I uninstalled both Celery and django-celery-beat, and installed them again with their latest version and it worked.

Thanks.. I have tried with all those 3 as well, without luck.
I will try it later with a clean rebuild of the environment..

@xeor Well, it is quite possible that this issue isn't the one you are experiencing, though maybe it is. The advice in this ticket has been consistent for all users experiencing this issue thus far, which led to out-of-control enqueuing of scheduled tasks, with those tasks building up and not being processed properly because of that. Could you describe your specific problem in more detail? It doesn't appear you left many details about the error(s) or the unexpected outcome you are getting.

Always happy to help. Just noting that we had this issue without the DatabaseScheduler and fixed it by only changing the timezones. In my testing I showed that the schedule files generated before and after this bug were identical, so I really don't think the bug is about the scheduler but rather about the type of datetimes being passed into the localize() call.

Thanks for the heads-up!

I won't be able to take a deeper look before maybe late next week or the week after. I'll keep myself informed on the progress in this thread in the meantime.

I'm not sure if my setup is anything special, but I am using docker, amqp, rabbitmq and all the newest versions of celery python packages (not rabbitmq tho).. (sorry, I don't have the env here so I can check)..

I have a similar issue: sometimes celery-beat does not work (it doesn't send the task to the broker). Additionally, it sends too many tasks at minute 59.
I made a test task to run every minute:

[2017-11-09 20:52:00,052: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:53:00,049: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:54:00,019: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:55:00,027: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:56:00,049: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:57:00,004: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:58:00,045: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:00,032: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:00,035: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:00,037: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:00,044: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
...
[2017-11-09 20:59:59,977: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,979: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,981: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,986: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,989: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,994: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 20:59:59,997: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 21:00:00,000: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 21:01:00,047: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 21:02:00,047: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)
[2017-11-09 21:03:00,053: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)

At minute 59, a bunch of tasks start running, and when the time hits minute 0 it runs as expected again.
I have no clue about this bug.

This is my setting in celery 4.1.0

timezone = 'Asia/Seoul'
enable_utc = False

I am using the file db for the schedule.
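
One way to spot this kind of burst is to count the "Sending due task" lines per minute in the beat log. The following is a minimal sketch that assumes the log line layout shown in this thread; the function name and regular expression are made up for illustration.

```python
import re
from collections import Counter

# Matches beat log lines such as:
# [2017-11-09 20:59:00,032: INFO/MainProcess] Scheduler: Sending due task test-task (...)
# The first group keeps the timestamp truncated to the minute.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3}: \w+/\w+\] "
    r"Scheduler: Sending due task (\S+)"
)


def sends_per_minute(lines):
    """Count 'Sending due task' lines per (minute, schedule entry name)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.groups()] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "[2017-11-09 20:58:00,045: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)",
    "[2017-11-09 20:59:00,032: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)",
    "[2017-11-09 20:59:00,035: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)",
    "[2017-11-09 20:59:00,037: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)",
    "[2017-11-09 21:00:00,000: INFO/MainProcess] Scheduler: Sending due task test-task (tasks.test.test_task)",
]

counts = sends_per_minute(sample)
# An every-minute task should be sent once per minute; anything more is suspect.
suspicious = {key: n for key, n in counts.items() if n > 1}
print(suspicious)
```

For a task scheduled every minute, any minute with more than one send is worth a closer look.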

I have this issue as well on python 3.6.3, pytz 2017.3, django 1.11.7, celery 4.1.0 and django-celery-beat 1.1.0.

I wipe the database first:

#update django_celery_beat_periodictask set last_run_at = NULL;
#select name, last_run_at from django_celery_beat_periodictask;
           name           |          last_run_at          
--------------------------+-------------------------------
celery.backend_cleanup |                                                       
> pipenv run celery beat -A appname -l debug --scheduler django_celery_beat.schedulers:DatabaseScheduler
Loading .env environment variables…
celery beat v4.1.0 (latentcall) is starting.
__    -    ... __   -        _
LocalTime -> 2017-11-30 08:28:58
Configuration ->
    . broker -> amqp://guest:**@localhost:5672//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> django_celery_beat.schedulers.DatabaseScheduler

    . logfile -> [stderr]@%DEBUG
    . maxinterval -> 5.00 seconds (5s)
[2017-11-30 08:28:58,945: DEBUG/MainProcess] Setting default socket timeout to 30
[2017-11-30 08:28:58,946: INFO/MainProcess] beat: Starting...
[2017-11-30 08:28:58,946: DEBUG/MainProcess] DatabaseScheduler: initial read
[2017-11-30 08:28:58,946: INFO/MainProcess] Writing entries...
[2017-11-30 08:28:58,968: DEBUG/MainProcess] DatabaseScheduler: Fetching database schedule
[2017-11-30 08:28:59,068: DEBUG/MainProcess] Current schedule:
<ModelEntry: celery.backend_cleanup celery.backend_cleanup(*[], **{}) <crontab: 0 4 * * * (m/h/d/dM/MY)>>
[2017-11-30 08:28:59,115: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2017-11-30 08:28:59,115: INFO/MainProcess] Writing entries...
[2017-11-30 08:28:59,115: DEBUG/MainProcess] DatabaseScheduler: Fetching database schedule
[2017-11-30 08:28:59,121: DEBUG/MainProcess] Current schedule:
<ModelEntry: celery.backend_cleanup celery.backend_cleanup(*[], **{}) <crontab: 0 4 * * * (m/h/d/dM/MY)>>
[2017-11-30 08:28:59,122: DEBUG/MainProcess] beat: Ticking with max interval->5.00 seconds
[2017-11-30 08:28:59,138: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True, 'connection.blocked': True, 'consumer_priorities': True, 'authentication_failure_close': True, 'per_consumer_qos': True, 'direct_reply_to': True}, 'cluster_name': 'rabbit@Jupiter', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'platform': 'Erlang/OTP 20.1', 'product': 'RabbitMQ', 'version': '3.6.14'}, mechanisms: [b'AMQPLAIN', b'PLAIN'], locales: ['en_US']
[2017-11-30 08:28:59,152: DEBUG/MainProcess] using channel_id: 1
[2017-11-30 08:28:59,153: DEBUG/MainProcess] Channel open
[2017-11-30 08:28:59,154: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-11-30 08:28:59,155: INFO/MainProcess] Writing entries...
[2017-11-30 08:28:59,160: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,161: DEBUG/MainProcess] celery.backend_cleanup sent. id->1dd626be-1dea-43ec-b000-ab61fdd33f9d
[2017-11-30 08:28:59,163: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,163: DEBUG/MainProcess] celery.backend_cleanup sent. id->7a9c7d44-e570-4a5a-9803-0a8e5111f035
[2017-11-30 08:28:59,165: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,166: DEBUG/MainProcess] celery.backend_cleanup sent. id->114ee8e1-4b3c-4f43-a632-9a249d7db364
[2017-11-30 08:28:59,167: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,168: DEBUG/MainProcess] celery.backend_cleanup sent. id->5b7f3825-d6c8-43a5-b056-2d567ec2c4df
[2017-11-30 08:28:59,170: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,171: DEBUG/MainProcess] celery.backend_cleanup sent. id->f1bfb936-0dd1-47b6-be10-3763d4446758
[2017-11-30 08:28:59,172: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,173: DEBUG/MainProcess] celery.backend_cleanup sent. id->7a12f2da-3717-45ab-b018-6b4fd7b83982
[2017-11-30 08:28:59,175: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,175: DEBUG/MainProcess] celery.backend_cleanup sent. id->64fbd61d-e80e-4a32-a49d-31ddc7e155c7
[2017-11-30 08:28:59,177: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,179: DEBUG/MainProcess] celery.backend_cleanup sent. id->ff38e88e-e7e8-4436-9724-9c416dde4d72
[2017-11-30 08:28:59,181: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
[2017-11-30 08:28:59,181: DEBUG/MainProcess] celery.backend_cleanup sent. id->d5116c47-df14-4f3e-a4d1-09087cd1af80
[2017-11-30 08:28:59,183: INFO/MainProcess] Scheduler: Sending due task celery.backend_cleanup (celery.backend_cleanup)
...

And the queue continues to fill at the rate of 600/sec.

# select name, last_run_at from django_celery_beat_periodictask;
           name           |          last_run_at          
--------------------------+-------------------------------
 celery.backend_cleanup   | 2017-11-30 16:40:59.352453-08 

My settings are (I set everything I could find because the documentation is extremely unclear and outdated in several places):

settings.py
CELERY_TIMEZONE = 'Canada/Pacific'
CELERY_ENABLE_UTC=False
USE_TZ = True
TIME_ZONE = 'Canada/Pacific'

celery.py
app = Celery('MyApp')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.conf.timezone = 'Canada/Pacific'
app.conf.enable_utc = False

So it's clear that what is happening is that celery runs the task at 08:28:59-08, but then, when storing last_run_at, it still adds 8 hours to the time to get 16:28:59-08 before storing it in the DB.

Taking a quick look at schedules.py tells me we are returning a timedelta or a number of seconds from crontab.is_due().

I don't have more time to keep digging here, but obviously something inside the crontab class is getting a timedelta between the current time and the current time with its tz replaced (not converted).

I would be very suspicious of lines that replace timezones.
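
The difference between converting a timezone and replacing it can be shown with the standard library alone. The following is a minimal sketch using the 08:28:59-08 run time from the log above; the fixed -08:00 offset is an assumption standing in for Canada/Pacific in November, and the code does not reproduce anything from Celery or django-celery-beat.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for Canada/Pacific in November (an assumption).
PACIFIC = timezone(timedelta(hours=-8))

run_at = datetime(2017, 11, 30, 8, 28, 59, tzinfo=PACIFIC)

# Converting keeps the instant and changes the wall clock: 16:28:59+00:00.
converted = run_at.astimezone(timezone.utc)

# Replacing keeps the wall clock and changes the instant: 16:28:59-08:00.
relabelled = converted.replace(tzinfo=PACIFIC)

skew = relabelled - run_at

print(converted.isoformat())
print(relabelled.isoformat())
print(skew)
```

A convert followed by a replace yields the same 8-hour shift as the stored last_run_at value described above, which is why lines that swap tzinfo without converting are good candidates for review.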

Alright: could everyone who had this bug clone master and re-test to confirm that it fixes the issue for you? My PR was merged last night and I just verified that it fixed the bug, but it would be good to get additional confirmation from those who use the DB scheduler or other backends. Thanks!

Apply commit 976515108a4357397a3821332e944bb85550dfa2 and check.
