Celery: Connection to broker lost. Trying to re-establish the connection...

Created on 23 Mar 2017  ·  40 Comments  ·  Source: celery/celery

We are using Django Rest Framework with MongoEngine, Redis, Celery and Kombu, and we are getting the following error in our logs:

[2017-03-22 06:26:01,702: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer/consumer.py", line 594, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/loops.py", line 88, in asynloop
    next(loop)
  File "/usr/local/lib/python2.7/dist-packages/kombu/async/hub.py", line 345, in create_loop
    cb(*cbargs)
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 1039, in on_readable
    self.cycle.on_readable(fileno)
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 337, in on_readable
    chan.handlers[type]()
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 667, in _receive
    ret.append(self._receive_one(c))
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 678, in _receive_one
    response = c.parse_response()
  File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2183, in parse_response
    return self._execute(connection, connection.read_response)
  File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2176, in _execute
    return command(*args)
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 577, in read_response
    response = self._parser.read_response()
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 238, in read_response
    response = self._buffer.readline()
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 168, in readline
    self._read_from_socket()
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 143, in _read_from_socket
    (e.args,))
ConnectionError: Error while reading from socket: ('Connection closed by server.',)

[2017-03-22 06:26:01,868: INFO/MainProcess] Connected to redis://:**********/1

Versions used

python==2.7.12
redis==3.2.3
kombu==4.0.2
Django==1.10.3
celery==4.0.2
amqp==2.1.1
billiard==3.5.0.2
pytz==2016.7

We have HAProxy in front of a Redis cluster, and everything else works without any issues. Could you please help us troubleshoot this error,
or point us at what to look for and where to look?

We really appreciate your help,

Thanks,
Darsana

Labels: RabbitMQ Broker · Bug Report · Major · Sprint Candidate

Most helpful comment

Same issue with celery==4.2.1.
How do I resolve this?

All 40 comments

@dharanpdarsana Do you have redis running? Here is how you can validate -

$ redis-cli                                                                
redis 127.0.0.1:6379> ping
PONG
redis 127.0.0.1:6379> set mykey somevalue
OK
redis 127.0.0.1:6379> get mykey
"somevalue"

@jpatel3 Redis is running, and 90% of the connections go through without any issue.

I am also (suddenly) experiencing a broken pipe error.
It was working perfectly fine before; I don't recall what I installed or changed to trigger this.

[2017-05-22 17:18:10,939: INFO/MainProcess] Connected to amqp://user123:**@192.168.99.100:5672//
[2017-05-22 17:18:10,955: INFO/MainProcess] mingle: searching for neighbors
[2017-05-22 17:18:10,965: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/worker/consumer/mingle.py", line 38, in start
    self.sync(c)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/worker/consumer/mingle.py", line 42, in sync
    replies = self.send_hello(c)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/worker/consumer/mingle.py", line 55, in send_hello
    replies = inspect.hello(c.hostname, our_revoked._data) or {}
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/app/control.py", line 129, in hello
    return self._request('hello', from_node=from_node, revoked=revoked)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/app/control.py", line 81, in _request
    timeout=self.timeout, reply=True,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/celery/app/control.py", line 436, in broadcast
    limit, callback, channel=channel,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/pidbox.py", line 315, in _broadcast
    serializer=serializer)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/pidbox.py", line 290, in _publish
    serializer=serializer,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/messaging.py", line 194, in <listcomp>
    [maybe_declare(entity) for entity in declare]
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/common.py", line 125, in maybe_declare
    return _maybe_declare(entity, declared, ident, channel, orig)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/common.py", line 131, in _maybe_declare
    entity.declare(channel=channel)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/kombu/entity.py", line 185, in declare
    nowait=nowait, passive=passive,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/amqp/channel.py", line 630, in exchange_declare
    wait=None if nowait else spec.Exchange.DeclareOk,
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/amqp/abstract_channel.py", line 64, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/amqp/method_framing.py", line 174, in write_frame
    write(view[:offset])
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/amqp/transport.py", line 269, in write
    self._write(s)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/eventlet/greenio/base.py", line 397, in sendall
    tail = self.send(data, flags)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/eventlet/greenio/base.py", line 391, in send
    return self._send_loop(self.fd.send, data, flags)
  File "/Users/richmond/Projects/backend/env/lib/python3.6/site-packages/eventlet/greenio/base.py", line 378, in _send_loop
    return send_method(data, *args)
BrokenPipeError: [Errno 32] Broken pipe
[2017-05-22 17:18:11,032: INFO/MainProcess] Connected to amqp://user123:**@192.168.99.100:5672//
[2017-05-22 17:18:11,046: INFO/MainProcess] mingle: searching for neighbors
[2017-05-22 17:18:12,075: INFO/MainProcess] mingle: all alone
[2017-05-22 17:18:12,104: INFO/MainProcess] [email protected] ready.

$ pip freeze

alabaster==0.7.10
alembic==0.9.1
amqp==2.1.4
aniso8601==1.2.0
appdirs==1.4.3
argh==0.26.2
Babel==2.4.0
billiard==3.5.0.2
blinker==1.4
bumpversion==0.5.3
cached-property==1.3.0
celery==4.0.2
cffi==1.10.0
click==6.7
colorama==0.3.8
coverage==4.1
cryptography==1.7
docker==2.2.1
docker-compose==1.11.2
docker-pycreds==0.2.1
dockerpty==0.4.1
docopt==0.6.2
docutils==0.13.1
ecdsa==0.13
enum-compat==0.0.2
eventlet==0.21.0
factory-boy==2.8.1
Faker==0.7.11
flake8==2.6.0
Flask==0.12
Flask-Babel==0.11.2
Flask-Cors==3.0.2
Flask-GraphQL==1.4.0
Flask-Mail==0.9.1
Flask-Migrate==2.0.3
Flask-RESTful==0.3.5
Flask-Script==2.0.5
Flask-SocketIO==2.8.6
Flask-SQLAlchemy==2.1
future==0.16.0
graphene==1.1.3
graphene-sqlalchemy==1.1.1
graphql-core==1.1
graphql-relay==0.4.5
greenlet==0.4.12
gunicorn==19.7.1
idna==2.5
imagesize==0.7.1
inflection==0.3.1
iso8601==0.1.11
itsdangerous==0.24
Jinja2==2.9.6
jsonschema==2.6.0
kombu==4.0.2
Mako==1.0.6
MarkupSafe==1.0
mccabe==0.5.3
mod-wsgi==4.5.15
ndg-httpsclient==0.4.2
packaging==16.8
pathtools==0.1.2
pluggy==0.3.1
promise==2.0
py==1.4.33
pyasn1==0.2.3
pycodestyle==2.0.0
pycparser==2.17
pycrypto==2.6.1
pyflakes==1.2.3
Pygments==2.2.0
PyMySQL==0.7.10
pyOpenSSL==17.0.0
pyparsing==2.2.0
python-dateutil==2.6.0
python-editor==1.0.3
python-engineio==1.4.0
python-jose==1.3.2
python-socketio==1.7.4
pytz==2017.2
PyYAML==3.11
raven==5.32.0
requests==2.11.1
singledispatch==3.4.0.3
six==1.10.0
snowballstemmer==1.2.1
Sphinx==1.4.8
SQLAlchemy==1.1.5
texttable==0.8.8
tox==2.3.1
typing==3.6.1
vine==1.1.3
virtualenv==15.1.0
voluptuous==0.9.3
watchdog==0.8.3
websocket-client==0.40.0
Werkzeug==0.12.1

I have the same problem with the same versions, and I can't find a solution :(

I am using kombu and running into the same situation. I'm not sure, but I think it may be an issue with HAProxy.

When connected through HAProxy, which routes connections to several RabbitMQ nodes, both producers and consumers hit the same "connection to broker lost" issue; when connected directly to a RabbitMQ node, the issue never occurs.
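
If the drops only show up behind the proxy, one likely mismatch (a hedged sketch, not something confirmed in this thread) is HAProxy's idle timeout being shorter than the interval at which an otherwise quiet AMQP connection carries any traffic. The settings below are standard Celery 4.x configuration names; the broker URL and the 30-second heartbeat are illustrative assumptions only:

```python
from celery import Celery

# Sketch only: the broker URL and the numeric values are placeholders.
app = Celery('tasks', broker='amqp://guest@localhost//')

# Keep AMQP heartbeats well under HAProxy's idle timeout so the proxied
# connection never looks idle, and keep retrying if it does get dropped.
app.conf.broker_heartbeat = 30                 # seconds; keep below the proxy's idle timeout
app.conf.broker_connection_retry = True        # re-establish the broker connection on failure
app.conf.broker_connection_max_retries = None  # retry forever instead of giving up
```

The proxy-side equivalent is raising HAProxy's timeout client / timeout server, which a later comment in this thread reports doing.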

Also seeing this, also behind HAProxy:

kombu==4.0.2
celery==4.0.2
redis==2.10.5

It may be happening with the SQS broker too:

boto==2.47.0
celery==4.0.2
kombu==4.0.2

The same problem here, Celery with RabbitMQ and eventlet.

amqp==2.2.1
billiard==3.5.0.3
celery==4.1.0
Django==1.8.5
eventlet==0.21.0
kombu==4.1.0
pytz==2017.2
vine==1.1.4

[2017-08-10 06:35:39,629: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 320, in start
    blueprint.start(self)
  File "/usr/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 596, in start
    c.loop(*c.loop_args())
  File "/usr/lib/python2.7/site-packages/celery/worker/loops.py", line 118, in synloop
    connection.drain_events(timeout=2.0)
  File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 301, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 103, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 485, in drain_events
    while not self.blocking_read(timeout):
  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 490, in blocking_read
    frame = self.transport.read_frame()
  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 240, in read_frame
    frame_header = read(7, True)
  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 415, in _read
    s = recv(n - len(rbuf))
  File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 360, in recv
    return self._recv_loop(self.fd.recv, b'', bufsize, flags)
  File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 354, in _recv_loop
    self._read_trampoline()
  File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 325, in _read_trampoline
    timeout_exc=socket_timeout('timed out'))
  File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 207, in _trampoline
    mark_as_closed=self._mark_as_closed)
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 163, in trampoline
    return hub.switch()
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 295, in switch
    return self.greenlet.switch()
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 347, in run
    self.wait(sleep_time)
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 84, in wait
    presult = self.do_poll(seconds)
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/epolls.py", line 61, in do_poll
    return self.poll.poll(seconds)
  File "/usr/lib/python2.7/site-packages/celery/apps/worker.py", line 333, in restart_worker_sig_handler
    safe_say('Restarting celery worker ({0})'.format(' '.join(sys.argv)))
  File "/usr/lib/python2.7/site-packages/celery/apps/worker.py", line 89, in safe_say
    print('\n{0}'.format(msg), file=sys.__stderr__)
IOError: [Errno 32] Broken pipe
[2017-08-10 06:35:39,820: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2017-08-10 06:35:39,940: INFO/MainProcess] mingle: searching for neighbors
[2017-08-10 06:35:40,975: INFO/MainProcess] mingle: all alone
[2017-08-10 06:35:41,115: INFO/MainProcess] pidbox: Connected to amqp://guest:**@127.0.0.1:5672//.

I'm having this same problem with RabbitMQ running on localhost. It's odd that local connections would time out or be dropped. I do have some pretty intensive tasks, and my CPU is at 100%, but my memory usage is low. Would that cause dropped connections?

celery==4.0.2
kombu==4.0.2
Django==1.8.6
RabbitMQ==3.2.4
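
One explanation sometimes offered for this pattern (a hedged aside, not something established in this thread) is that a worker pegged at 100% CPU cannot answer AMQP heartbeats in time, so RabbitMQ closes the connection. A minimal sketch of the usual workaround, which simply disables heartbeats and therefore also gives up dead-connection detection; the broker URL is an assumption:

```python
from celery import Celery

# Sketch under the assumption the drops are heartbeat-related: with
# broker_heartbeat = 0 neither side expects heartbeat frames, so a busy
# worker is not disconnected for missing them.
app = Celery('tasks', broker='amqp://guest@localhost//')
app.conf.broker_heartbeat = 0  # disable AMQP heartbeats entirely
```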

@chrisspen, were you able to resolve the issue? I am seeing a similar scenario: while performing a CPU-intensive task, the connection to Redis gets dropped. The following is one of the many exceptions I receive:

[2017-10-11 13:34:57,244: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/python/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "/usr/local/python/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/local/python/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 594, in start
    c.loop(*c.loop_args())
  File "/usr/local/python/lib/python2.7/site-packages/celery/worker/loops.py", line 88, in asynloop
    next(loop)
  File "/usr/local/python/lib/python2.7/site-packages/kombu/async/hub.py", line 345, in create_loop
    cb(*cbargs)
  File "/usr/local/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 1039, in on_readable
    self.cycle.on_readable(fileno)
  File "/usr/local/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 337, in on_readable
    chan.handlers[type]()
  File "/usr/local/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 714, in _brpop_read
    **options)
  File "/usr/local/python/lib/python2.7/site-packages/redis/client.py", line 585, in parse_response
    response = connection.read_response()
  File "/usr/local/python/lib/python2.7/site-packages/redis/connection.py", line 577, in read_response
    response = self._parser.read_response()
  File "/usr/local/python/lib/python2.7/site-packages/redis/connection.py", line 238, in read_response
    response = self._buffer.readline()
  File "/usr/local/python/lib/python2.7/site-packages/redis/connection.py", line 168, in readline
    self._read_from_socket()
  File "/usr/local/python/lib/python2.7/site-packages/redis/connection.py", line 143, in _read_from_socket
    (e.args,))
ConnectionError: Error while reading from socket: ('Connection closed by server.',)

I also had a "Connection to broker lost" using RabbitMQ + Celery under high CPU usage; not sure if it's related.

this might be an issue with HAProxy

@auvipy I doubt it. I do not use HAProxy and have been trying to use Celery 4.0.3 to migrate from 3.1.25.

I still get this error.

Let's wait for the 4.2 release.

I'm having this same problem with Redis running on localhost. My Redis config timeout value is 5. When too many tasks run longer than 5 seconds, it results in this exception:

[2018-03-28 10:13:09,557: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/python2.7/lib/python2.7/site-packages/celery-4.2.0rc1-py2.7.egg/celery/worker/consumer/consumer.py", line 322, in start
    blueprint.start(self)
  File "/usr/local/python2.7/lib/python2.7/site-packages/celery-4.2.0rc1-py2.7.egg/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/local/python2.7/lib/python2.7/site-packages/celery-4.2.0rc1-py2.7.egg/celery/worker/consumer/consumer.py", line 598, in start
    c.loop(*c.loop_args())
  File "/usr/local/python2.7/lib/python2.7/site-packages/celery-4.2.0rc1-py2.7.egg/celery/worker/loops.py", line 91, in asynloop
    next(loop)
  File "/usr/local/python2.7/lib/python2.7/site-packages/kombu/async/hub.py", line 354, in create_loop
    cb(*cbargs)
  File "/usr/local/python2.7/lib/python2.7/site-packages/kombu/transport/redis.py", line 1040, in on_readable
    self.cycle.on_readable(fileno)
  File "/usr/local/python2.7/lib/python2.7/site-packages/kombu/transport/redis.py", line 337, in on_readable
    chan.handlers[type]()
  File "/usr/local/python2.7/lib/python2.7/site-packages/kombu/transport/redis.py", line 714, in _brpop_read
    **options)
  File "/usr/local/python2.7/lib/python2.7/site-packages/redis/client.py", line 680, in parse_response
    response = connection.read_response()
  File "/usr/local/python2.7/lib/python2.7/site-packages/redis/connection.py", line 624, in read_response
    response = self._parser.read_response()
  File "/usr/local/python2.7/lib/python2.7/site-packages/redis/connection.py", line 284, in read_response
    response = self._buffer.readline()
  File "/usr/local/python2.7/lib/python2.7/site-packages/redis/connection.py", line 216, in readline
    self._read_from_socket()
  File "/usr/local/python2.7/lib/python2.7/site-packages/redis/connection.py", line 191, in _read_from_socket
    (e.args,))
ConnectionError: Error while reading from socket: ('Connection closed by server.',)
[2018-03-28 10:13:09,558: WARNING/MainProcess] Restoring 7 unacknowledged message(s)

I also found that if the task argument is very long (for example, a string of length 65000), the connection between Celery and Redis gets closed and cannot reconnect.
Here is an example:

```python
# filename: task.py
# test for celery task
import time
from celery import Celery

app = Celery('tasks', broker='redis://:@127.0.0.1:6379/2')

@app.task
def test(data):
    time.sleep(20)
```

Start 2 celery workers:

```bash
celery -A task worker -c 2 -l info
```

```python
# filename: test.py
# produce celery tasks
from task import test

for i in range(40):
    test.delay('1' * 65000)
```

Run the test:

```bash
python test.py
```

Wait about 2 minutes, then run this command and you will find that no connection between Celery and Redis is ESTABLISHED:

netstat -natp | grep 6379 | grep ESTABLISHED

This indicates that the connections between Celery and Redis were closed and failed to reconnect. So what causes the dropped connections and the failed reconnects?

celery-4.1.0 or celery-4.2.0rc1
kombu==4.1.0
billiard==3.5.0.3
pytz==2018.3
amqp==2.2.2
vine==1.1.4

redis 4.0.2

I don't know why this happens, but if the timeout in redis.conf is set to 0, the problem disappears.
Thank you.
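
For reference, the timeout mentioned above is the Redis server's idle-client timeout; 0 means idle clients are never disconnected. A small redis-py sketch for checking and changing it at runtime; the host/port/db are assumptions, and CONFIG SET does not persist across a server restart unless redis.conf is updated as well:

```python
import redis

# Assumed connection details; point this at the same Redis the broker uses.
r = redis.StrictRedis(host='127.0.0.1', port=6379, db=2)

print(r.config_get('timeout'))  # e.g. {'timeout': '5'}: idle clients dropped after 5s
r.config_set('timeout', 0)      # 0 = never close idle client connections
print(r.config_get('timeout'))  # verify the new value
```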

I have the same problem with amqp broker and:
kombu==4.2.1
celery==4.2.0

Same issue with celery==4.2.0 and kombu==4.2.1

Has anyone found a fix in practice?

Same issue with celery==4.2.1.
How do I resolve this?

Same issue; I am using celery==4.2.0, kombu==4.2.0, and a RabbitMQ cluster as broker.
Any solution?

I also have this problem. My logs are constantly spammed by these exceptions.

Same here with the latest Celery; is any fix available?

I tried the following to suppress these errors; it works fine for me.

  • SQS with botocore
  • Retry on connection errors
from billiard.exceptions import (
    SoftTimeLimitExceeded as CelerySoftTimeLimitExceeded,
)
from botocore.exceptions import (
    ClientError as BotocoreClientError,
    HTTPClientError as BotocoreHTTPClientError,
)

@app.task(
    autoretry_for=(
        BotocoreClientError,
        BotocoreHTTPClientError,
        CelerySoftTimeLimitExceeded,
    ),
)
def my_task():
    pass

Hi.
I have the same problem with celery 4.2.1 and amqp 2.4.0.
I am now trying celery 4.2.1 with amqp 2.3.2, and it seems good and works well.
Hope this helps somebody who has the same problem.

@auvipy, I can see that this issue was closed but I am not sure if there is a fix for it yet or if one is planned... It would be great if you could confirm what the current status is, thanks.

could you please try celery and kombu from the master and report again?

I'm also having this issue. I tried the solution from @shaoeChen above, using amqp 2.3.2 and celery 4.2.1, but I'm still having the problem.

Same issue,
celery 4.2.1
kombu 4.3.0
amqp 2.4.1
redis 2.10.5

Fixed when downgrading to amqp 2.3.2 as per @shaoeChen's answer.

In the special case of AWS ELB + a RabbitMQ cluster, this configuration seems to solve it:

[screenshot of the ELB idle timeout (connection settings) being raised]

The default idle timeout is 60 seconds.
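
For anyone scripting the same change, here is a hedged boto3 sketch for a classic ELB; the load balancer name and the 350-second value are placeholders, and for an ALB the equivalent knob is the idle_timeout.timeout_seconds attribute on the elbv2 client:

```python
import boto3

# Sketch only: raise the classic ELB idle timeout above the broker's
# heartbeat interval so quiet AMQP connections are not cut by the ELB.
elb = boto3.client('elb')
elb.modify_load_balancer_attributes(
    LoadBalancerName='my-rabbitmq-elb',              # placeholder name
    LoadBalancerAttributes={
        'ConnectionSettings': {'IdleTimeout': 350},  # seconds; the default is 60
    },
)
```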

Same issue

redis==3.2.1
celery[redis]==4.3.0

Restarting Celery makes it OK again.

try celery==4.4.0rc5

@auvipy would be good if you reference the change in celery==4.4.0rc5 that fixes this issue for posterity.

FWIW, bumping to 4.4.0rc5 seems to have stabilized this issue for me as well.

celery==4.4.0 is the current stable release. Try that and report back.

celery==4.4.0
kombu==4.6.7
works (better).

Thanks for referencing 4.4.x milestone

UPDATE: Whereas before it was dropping connections frequently (roughly every minute), now it seems to drop connections much less frequently (about once an hour). I'd have to log it and do some deeper log analysis for a clearer signal.

I am facing the same issue
[2020-04-22 05:31:09,243: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "c:\users\administrator\code\user\cel_env\lib\site-packages\redis\connection.py", line 700, in send_packed_command
    sendall(self._sock, item)
  File "c:\users\administrator\code\user\cel_env\lib\site-packages\redis\_compat.py", line 8, in sendall
    return sock.sendall(*args, **kwargs)
  File "c:\users\administrator\code\user\cel_env\lib\site-packages\eventlet\greenio\base.py", line 403, in sendall
    tail = self.send(data, flags)
  File "c:\users\administrator\code\user\cel_env\lib\site-packages\eventlet\greenio\base.py", line 397, in send
    return self._send_loop(self.fd.send, data, flags)
  File "c:\users\administrator\code\user\cel_env\lib\site-packages\eventlet\greenio\base.py", line 384, in _send_loop
    return send_method(data, *args)
ConnectionResetError: An existing connection was forcibly closed by the remote host

my pip list
billiard==3.6.3.0
BTrees==4.4.1
celery==4.4.2
kombu==4.6.8
redis==3.4.1

I'm facing the same issue as reported in https://github.com/apache/airflow/issues/11622. This started to occur after https://github.com/apache/airflow/pull/11336.

The interesting thing is that it only happens when we try to run the Celery worker in a daemonized way.

celery==4.4.7
kombu==4.6.11
billiard==3.6.3.0
redis==3.5.3

FYI: I tried upgrading to 5.0 but no success.

I had a similar issue using the HAProxy provided by the redis-ha Helm chart. It turns out its default timeout client was 30s, while Redis' default TCP keepalive interval is "about 5 minutes". After raising both timeout client and timeout server to 330s, the problem seems to be fixed.
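
The same mismatch can also be attacked from the client side by asking kombu's Redis transport to keep the TCP connection visibly alive to the proxy. A sketch, with the caveat that which options are honoured depends on the installed kombu/redis-py versions, so verify these names against your versions; the URL and timeout numbers are placeholders:

```python
from celery import Celery

app = Celery('tasks', broker='redis://127.0.0.1:6379/1')  # illustrative URL

# Sketch: these options are passed through kombu to the underlying redis-py
# connections. socket_keepalive keeps the link from looking idle to a proxy;
# the timeout values are placeholders, not tuned numbers.
app.conf.broker_transport_options = {
    'socket_keepalive': True,
    'socket_timeout': 30,          # seconds before a blocked read/write errors out
    'socket_connect_timeout': 10,  # seconds to wait when (re)connecting
}
```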

Should we add this to the docs?

Well, I'm not sure whether it's the right solution, but I changed the Amazon EC2 instance on which I run celery+rabbitmq+gunicorn+supervisor as part of my infrastructure for long tasks (up to several hours each) with thousands of messages, managed 24/7...

... from an instance with 1 GB of memory to one with 2 GB (your needs may differ, so adjust accordingly).

As the syslog file in Ubuntu showed, an OOM error was killing the broker and other processes, and increasing the server's memory solved this.

Now my logs are clear of errors, and there was no need to adjust my code.

very practical!
