Celery: Proposal to deprecate Redis as a broker support. [Rejected]

Created on 24 Jun 2016  ·  54 Comments  ·  Source: celery/celery

If we removed support for Redis, we'd be able to focus the time on RabbitMQ, or maybe Kafka/nsq.
There's also the huge task of porting to asyncio in Python 3.

I think this should be seriously considered. If we are to work on this for fun, then the popularity of the transport shouldn't matter.

In the end I will choose what I have the ability to work on anyway, but at least you can voice your concerns here.

Project Governance

All 54 comments

Nowadays alternatives to celery exist, e.g. huey and rq, which explicitly focus on supporting redis as a broker. When celery was released, there was nothing.

@ask what do you think about also dropping sql broker support? I doubt there are many people who use an sql database in production. Even if they do, they actually shouldn't.
We also have docker which means deploying rabbitmq is literally one command away. It's not _that_ hard anymore.

I just dropped support for SQL as a broker, including all other brokers other than RabbitMQ / Redis / SQS / Qpid :)

(duplicate)

I'm not that tied into the celery community; regardless my $0.02:

I can empathize with your sentiments around the maintenance issues, and this sort of action is a step towards sustainability. But are there any other 'amputations' you could make?

On the 'supply' side of the equation, do you have any thoughts why there isn't more contribution to celery from the community?
A cursory look at the commits & PRs suggests celery is mainly the work of one person, in comparison to a couple of open source libraries I contribute to

@MaximilianR Speaking only for myself:

  1. This is a sizable codebase to walk into
  2. Especially if you are not an expert in kombu and asyncio
  3. And you don't have intuitions about the depth of reported issues

That said, I do use celery, so if there is a way I can be useful, I'd be happy to help.

@MaximilianR, There have been numerous contributors, but I've definitely done a lot of the work. The project grew too fast, and at some point I had the choice between fixing bugs or supporting users on IRC/email/StackOverflow etc. Especially things like the multiprocessing pool deadlock issue took well over 6 months of focused coding, when I should have been mentoring people. We were at that point almost the size of RabbitMQ in downloads, but where they had 8 people working full-time, I was the only one.

There's probably other things to amputate, but I don't think anything is as time consuming as Redis.

The work of porting amqp/kombu to full asyncio was also a major time sink, but necessary to solve a number of issues. It never completed though.

@ask Does your message above mean that you're still interested in putting effort into SQS? I noticed that there is only one open SQS ticket here, but I still see the "Experimental" warning in the docs. Can you advise on the current status and future needs for SQS?

We would be using Celery + SQS in production, so I may be able to contribute some effort toward it, but don't really want to get into that situation if SQS is not a part of your long-term plan for the project.

@ask thanks for sharing that above. I can appreciate where you're coming from. I hope through this / other approaches celery can be easier to maintain and continue its success...

Definitely do what's best for your project, but I will certainly push to get rid of Celery, internally, if Redis support is dropped. I'll elaborate in private if you really want to know why we won't use RabbitMQ, but I don't really want to bash on other projects in public.

Related: suggesting that it's one command because that's how you develop, on Docker, might work for you, but that's not necessarily how the rest of us deploy.

@nicksloan SQS is now listed as a supported transport in the master docs: http://docs.celeryproject.org/en/master/getting-started/brokers/index.html

The work to rewrite the SQS transport into using async I/O was sponsored for a bit over $1000 afair.

@scoates The Redis transport is in a worse situation since it's really hacked together to use async I/O for reading messages, but publishing is still synchronous. The Python redis library is synchronous so there are multiple challenges still even for reading messages, and a lot of uncertainty on how that works. Bugs can easily be hiding there, and any changes to the redis-py library could have drastic effects when we are using it in non-orthodox ways.

I want to support Redis, I really do. I was fighting for it when I worked at RabbitMQ/Pivotal as I felt that we need a common solution in Python for the patterns that Celery implements. But if the community and the companies using it are not invested in keeping it working, then I will be left fire fighting and at worst like now unable to fix serious problems with the transport that leads to people criticizing the project. This makes my life more miserable and decreases morale.

This might be too radical given celery's history, but have you thought about keeping only Redis over RabbitMQ?

@MaximilianR I have considered it, but my passion is in message passing and building correct distributed systems. Redis has yet to provide us with an implementation of BRPOPLPUSH that works for implementing message acknowledgments, and with the revelation of the inability to accept that it's impossible to rely on wall clock time in distributed systems, a basic fact, I'm more wary. RabbitMQ is at least attacking the problem space seriously, and there are other players like Kafka and NSQ. Lots of libraries are treating task queues as a simple list operation, but I refuse to let Celery become one of them :)

@scoates: I wasn't talking about how I'm developing or something. I was saying that claiming "rabbitmq is hard to deploy" in 2016 is nonsense. Even if you know nothing about deploying something somewhere, there is docker which really helps. Please, don't make wrong assumptions.

Thanks @ask for checking in with us. Celery is a great product and given the limited resources you (and the community) have I think focusing on the core product is the best decision. I can imagine how difficult it must be to try and offer the same great features using many different incompatible systems. From the consumer's point of view, I think it adds to the complexity of the product to have so many options. I am sure this will probably upset some of your users but I think sticking to fewer brokers is the right thing to do for the product. I'd like to see the focus on AMQP as it's a standard, well-accepted protocol and I think rabbitmq is a very good implementation of it.

It's sad that there really isn't all that much work to be done, but when you add it all up across all the transports and features it's not sustainable with just a few hours a week.

Since the Redis protocol is so simple, rewriting the transport to be fully async does not have to be that difficult. I've made a layer that enables Celery to support any Tornado library, the same could be made for asyncio and Twisted, so we may even have clients out there that we can reuse.

Python is changing drastically with asyncio in stdlib. We need new web frameworks, new network libraries and pretty much the whole ecosystem needs to adapt. It makes our job a bit easier since we don't have to keep maintaining our own event loop anymore, but the transition will require some work.

I also want to write a new worker in Go or similar now that we have a new message protocol with support for multiple languages. Redis is not the best match for messaging interoperability, as it does not implement message headers, properties etc. The AMQP protocol definitely has the upper hand there since it was one of the original use cases.

@ask My approach was to try to get companies to maintain their brokers. So let's try to speak with someone from Redis Labs and see if we get any traction.
The same for MongoDB and other services.
As for SQL brokers, I think they aren't a very good idea and we shouldn't support them.
We can extract the code instead of removing it, and look for someone to maintain it though. If there's not enough interest, then there's no need for them.
The only SQL database that can kinda be a broker is Postgres because it has Pub/Sub capabilities but that's not currently implemented anyway.

I think we can just deprecate it and see what happens. Maybe someone steps up for it, by taking over maintenance, or by sponsorship. If not we have a product that people need, but nobody wants to support, which is likely with 18 votes in total at this point. I don't think that will sway Redis Labs alone :)

> But if the community and the companies using it are not invested in keeping it working, then I will be left fire fighting and at worst like now unable to fix serious problems with the transport that leads to people criticizing the project. This makes my life more miserable and decreases morale.

We are using celery with the redis broker on multiple projects and are depending on it, but I strongly agree with that sentiment. If you can't keep it at a high level of quality, it will only give celery a bad name and make everybody unhappy – you _and_ the users.

Hi. I work for Redis Labs and until recently I was a celery user. @thedrow brought this to my attention, and we've discussed this internally. We're willing to help you guys, we think keeping redis as part of celery is important. I'm not sure yet whether I'll do it personally or someone else, but let's get the discussion on what needs to be done going.

Like @dvirsky my work on Redis is completely sponsored by Redis Labs, and if we'll be able to help with this backend (hopefully yes) I'll be involved in helping to find the best solutions on the Redis side, and potentially even extending the Redis messaging support in order to facilitate certain things in the implementation. We could also stress the ability of the backend to use Sentinel / Redis Cluster at some point in order to have an HA-ready experience. I hope we'll have good news ASAP, currently evaluating the effort needed.

This is really good news! I do completely understand you would like to know the work involved before committing to it :)

It's 3 am here right now, and I haven't collected a list of issues but here are some short notes:

Celery does not define a set of features that are implemented with an interface to Redis; instead it uses the AMQP API to use messaging in a generic way: name-to-queue messaging, pub/sub and topic routing. Topic routing is not a strict requirement but the rest are critical to Celery's main features of 1) handling task messages, 2) sending/receiving monitoring events, and 3) broadcast messaging to workers to manage them (e.g. shutdown/increase concurrency/etc.).
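For readers unfamiliar with the topic routing mentioned above, here is a minimal sketch of AMQP-style topic matching, where `*` matches exactly one dot-separated word and `#` matches zero or more. The function name `topic_matches` is hypothetical and this is not kombu's implementation, only an illustration of the matching rule.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP-style topic pattern."""
    p_words = pattern.split(".")
    k_words = routing_key.split(".") if routing_key else []

    def match(pi: int, ki: int) -> bool:
        # Pattern exhausted: succeed only if the key is exhausted too.
        if pi == len(p_words):
            return ki == len(k_words)
        word = p_words[pi]
        if word == "#":
            # '#' may absorb zero or more words of the routing key.
            return any(match(pi + 1, k) for k in range(ki, len(k_words) + 1))
        if ki == len(k_words):
            return False
        if word == "*" or word == k_words[ki]:
            # '*' matches exactly one word; literals must be equal.
            return match(pi + 1, ki + 1)
        return False

    return match(0, 0)
```

A binding such as `task.#` would then receive both `task` and `task.retry.failed`, while `*.events` would receive `worker.events` but not `events`.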

1) Needs to be async so it does not block the worker for any operation

The current version uses the redis-py library, which is synchronous. I have hacked this to make it consume messages in an async manner, but I suspect there are still bugs there since we don't know exactly what happens in the client. There may already be async redis clients available for Tornado/asyncio which we could use, or, worst case, since we don't need much, doing it raw may not be so tricky.

2) Connection management

We have one connection consuming tasks, and one connection doing pubsub, then a pool of connections to do out-of-band operations, like acknowledging messages or restoring unacknowledged messages. There is some confusion around error handling here, and we may not be closing all connections after use.

3) Message acknowledgment

Messages are only removed from the server after they are acked and we have a visibility timeout where all message consumers try to restore messages. Well, it's a mess; I can describe this in more detail later.
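A minimal in-memory sketch of the ack emulation described above: a delivered message is parked in an "unacked" table until it is acknowledged, and any consumer may put entries older than the visibility timeout back on the queue. The class and method names are hypothetical, and this is not the kombu Redis transport code; a real implementation would keep this state in Redis rather than in process memory.

```python
import time
from collections import deque


class AckEmulatingQueue:
    """Toy queue with acknowledgments and a visibility timeout."""

    def __init__(self, visibility_timeout: float = 3600.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock          # injectable so the timeout can be tested
        self.ready = deque()        # messages waiting for delivery
        self.unacked = {}           # delivery tag -> (payload, delivered_at)
        self._next_tag = 0

    def put(self, payload):
        self.ready.append(payload)

    def get(self):
        """Deliver one message as (tag, payload), or None if the queue is empty."""
        if not self.ready:
            return None
        payload = self.ready.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = (payload, self.clock())
        return self._next_tag, payload

    def ack(self, tag):
        """Only now is the message really gone."""
        self.unacked.pop(tag, None)

    def restore_expired(self) -> int:
        """Requeue deliveries that were not acked within the visibility timeout."""
        now = self.clock()
        expired = [tag for tag, (_, delivered_at) in self.unacked.items()
                   if now - delivered_at >= self.visibility_timeout]
        for tag in expired:
            payload, _ = self.unacked.pop(tag)
            self.ready.append(payload)
        return len(expired)
```

One consequence visible even in this toy version: a task that legitimately runs longer than the visibility timeout is redelivered while the first copy is still in progress.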

Thanks @ask, about point "1" it may make sense to directly implement the minimal async Redis support we need for the few commands we send directly inside the broker instead of depending on an external lib.

Note that only 2, and possibly some other small issues require a solution soon. It's largely working, but there are cases when messages can be lost due to the message acknowledgement but that is a wishlist item as the situation was much worse before the ack emulation. The worker can be blocked by the synchronous Redis client, which is causing bad performance, and in some rare cases hanging workers.

> 1) Needs to be async so it does not block the worker for any operation

Last I checked, the async redis clients were not perfect (there are a couple for tornado). What I often did in these situations was use a thread pool executor and futures to make redis-py behave like an async client. As long as you don't have too many concurrent tasks and a few threads will do, it works better than async clients.

EDIT: I haven't worked with python's asyncio directly yet, but from what I've seen it's pretty similar to tornado so this pattern will probably be easy to do.
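A minimal sketch of the thread-pool pattern described above, written against asyncio. `blocking_get` is a hypothetical stand-in for a synchronous client call (for example a redis-py `get`), so the example runs without a Redis server; `fetch_all` is likewise an illustrative name.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_get(key: str) -> str:
    """Stand-in for a synchronous client call that blocks on the network."""
    time.sleep(0.05)  # simulate network latency
    return f"value-for-{key}"


async def fetch_all(keys, max_workers: int = 4):
    """Run the blocking calls in worker threads and await their futures."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, blocking_get, key) for key in keys]
        # gather() preserves the order of the submitted futures.
        return await asyncio.gather(*futures)
```

Calling `asyncio.run(fetch_all(["a", "b", "c"]))` returns the three values in order, and the event loop stays free while the threads wait on I/O.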

@dvirsky We are not allowed to use threads, as we are also using fork. Python does not support this case, and even if a patch was produced for cpython we would have to carefully patch existing python libraries, at least the C extensions, to be safe in this manner. This was realized on the Python bug tracker some time ago, which is why we have been doing the migration to async I/O. We also need to move there in general, since that's the future of Python now with core async I/O support.

@antirez re:

> about point "1" it may make sense to directly implement the minimal async Redis support we need for the few commands we send directly inside the broker instead of depend on an external lib.

I wouldn't do that. Most of redis-py's code revolves around networking and connection management, not around implementing commands...

@dvirsky We have connection management generics that would work perfectly for that, so that should not be any time sink. Networking we don't have much of, but I think most is parsing the protocol and we already do that near manually or hooking into the existing redis classes.

@ask yes there's hiredis which is a dedicated C extension for parsing redis protocol, IIRC it can work with an async client as well. what strategy did you choose up until now in making redis async?

anyway I need to have a look at what's been happening with the async clients recently, I haven't done much python work in the past year and a half. I see that https://github.com/leporo/tornado-redis hasn't been very active.

There is not much of a strategy, we add the socket to the event loop and send e.g. BRPOP synchronously, then we wait for the socket to be readable and we read the response synchronously ;)

Since we have no way of restarting in the middle of parsing the response
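The approach described above can be sketched roughly as follows, using only the standard library. A `socket.socketpair()` and a canned reply stand in for a real Redis connection, and `encode_command` and `run_demo` are hypothetical names; this is an illustration of the pattern, not the kombu transport code.

```python
import selectors
import socket


def encode_command(*args: bytes) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = b"*%d\r\n" % len(args)
    for arg in args:
        out += b"$%d\r\n%s\r\n" % (len(arg), arg)
    return out


def run_demo() -> bytes:
    # A socketpair stands in for the client/server TCP connection.
    client, server = socket.socketpair()
    selector = selectors.DefaultSelector()
    selector.register(client, selectors.EVENT_READ)
    try:
        # 1. Send the command with a plain blocking write.
        client.sendall(encode_command(b"BRPOP", b"queue", b"1"))

        # Fake server: consume the request and push back a canned reply.
        server.recv(4096)
        server.sendall(b"*2\r\n$5\r\nqueue\r\n$3\r\njob\r\n")

        # 2. The event loop reports when the socket becomes readable.
        ready = selector.select(timeout=1.0)
        assert ready and ready[0][0].fileobj is client

        # 3. Read the reply with a blocking call. A real parser would keep
        #    reading until the reply is complete, which is where a partial
        #    reply can stall the whole loop.
        return client.recv(4096)
    finally:
        selector.unregister(client)
        selector.close()
        client.close()
        server.close()
```

Only the wait between steps 1 and 3 is asynchronous; the write and the read themselves still block.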

@dvirsky If we use hiredis for anything other than protocol parsing we'll have compatibility problems with gevent/eventlet (when it's used instead of multiprocessing) and it will also make hiredis mandatory, which causes portability problems if Celery is run on PyPy.

@dvirsky Also py-hiredis does not expose any functionality other than protocol parsing at the moment.

Python will be needing a decent async redis client at some point I believe, so starting something proper wouldn't be a bad idea. E.g refactoring some of the protocol parsing code in redis-py.

you're using gevent right now?

@dvirsky We support gevent, eventlet and multiprocessing (prefork) with async I/O. But if you support one you usually have the others.

Yes, there are times where gevent is more appropriate for tasks. e.g. when tasks just write to the database or perform an HTTP request.
It's easy enough to expose and register the FDs in gevent but that requires further work in py-hiredis.

@dvirsky Here's an example of how it's done with PyCurl. https://gist.github.com/GuoJing/5875326
You need to expose the FD in order to collaborate with gevent.

I was recently introduced to the issue surrounding celery support for Redis and while I haven't completed a thorough analysis I can say that writing/upgrading a python redis client with asyncio sounds like a good idea, but from recent benchmarks, that would be best implemented on top of asyncio+libuv to squeeze as much performance as possible.

For references see

Against this position--or to complicate matters, I would note that from my own experience with Redis clients, as developer and user, relying entirely on libuv will actually prevent the client from achieving max performance, and for that hiredis is a must. There is a ton of stuff going on under-the-hood in libuv that actually slows io in some cases, ioredis does it through Node and it's not really performant.

So the solution I would envisage has a mixed hiredis-async implementation. (Stated differently, a bunch of alternatives need to be tried with tons of cherry-picking and benchmarking)

@merl-dev The problem is still PyPy where hiredis+cffi is slower at parsing the redis protocol than the pure python protocol parser implementation.
At least my version also has some test failures with redis-py. See https://github.com/redis/hiredis-py/pull/46

It is entirely possible to write a hiredis client as a CPython extension. I'm still not 100% sure about CFFI.

let's get it

@thedrow this reminds me - have you looked at the new redis streams feature? It can be used as a way more robust message broker than the current stuff. It's going to be GA pretty soon.

We haven't. We don't have the capacity to refactor the redis broker at the moment.
We were hoping you guys had some spare hands to do so.

@thedrow we just might... not promising anything but we might dedicate some more resources to supporting the redis ecosystem proactively.

@dvirsky you mean this proposal, right? So you could use TAPPEND to enqueue, TREAD + TACK to dequeue, process and acknowledge.

@georgepsarakis it's no longer a proposal, it's working, and the prefix is X, i.e. XADD. see https://www.youtube.com/watch?v=ELDzy9lCFHQ
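For illustration, the enqueue / dequeue / acknowledge flow discussed above maps onto the stream commands roughly as in the sketch below. It assumes a client object exposing redis-py's `xadd`, `xgroup_create`, `xreadgroup` and `xack` methods (redis-py 3.0+ against Redis 5.0+, default RESP2 reply shape); the helper names are hypothetical, and Celery does not work this way. The client is passed in rather than created here, so the functions can be exercised with a stub.

```python
def ensure_group(client, stream: str, group: str) -> None:
    """Create the consumer group (and the stream) if it does not exist yet."""
    try:
        client.xgroup_create(stream, group, id="0", mkstream=True)
    except Exception as exc:
        # Redis answers BUSYGROUP when the group already exists.
        if "BUSYGROUP" not in str(exc):
            raise


def enqueue(client, stream: str, fields: dict):
    """Append a message to the stream (XADD) and return its entry id."""
    return client.xadd(stream, fields)


def consume_once(client, stream: str, group: str, consumer: str, handler, count: int = 10) -> int:
    """Read new entries for this group (XREADGROUP), handle and ack them (XACK)."""
    handled = 0
    for _stream_name, entries in client.xreadgroup(group, consumer, {stream: ">"}, count=count):
        for entry_id, fields in entries:
            handler(fields)
            # Ack only after the handler returned without raising.
            client.xack(stream, group, entry_id)
            handled += 1
    return handled
```

Entries that were delivered but never acked stay in the group's pending list on the server, which is what makes redelivery possible without the client-side bookkeeping that list-based queues need.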

So what is the final call on supporting the redis broker? Is it going to be deprecated anytime soon?

Not at the moment.

I couldn't tell which way the community is leaning but we're starting a new project and I'm hesitant to go w/ Redis as a broker due to the thought of deprecating it. Thoughts?

It's safe to assume the redis broker will stay for the foreseeable future. Too many people rely on it.

All the while #301 / 601 are going stale.

@ermik try to help to get that in on related issue.
