Node-redis: Subscriptions do not receive data after reconnect

Created on 5 Jul 2017  ·  6Comments  ·  Source: NodeRedis/node-redis

node_redis: 2.7.1
redis: 3.2
platform: Amazon linux
description:
When redis' snapshotting occurs it disconnects the redis client. The client reconnects and resubscribes, however, they no longer receive data. This can be reproduced by issuing a SAVE command via the redis-cli.

JS

const redis = require('redis');
const redisClient = redis.createClient({
  host: config.redisEndpoint,
  port: config.redisPort,
});

redisClient.on('connect', () => {
  console.log('Connected to Redis');
  console.log('subscription_set:', redisClient.subscription_set);
});

redisClient.on('reconnecting', (stats) => {
  console.log('Reconnecting to Redis', stats);
});

redisClient.on('error', (error) => {
  console.log('Failed to connect to Redis: ', error);
});

redisClient.on('subscribe', (channel, count) => { // eslint-disable-line no-unused-vars
  console.log('Subscribed to: ', channel, count);
});

redisClient.on('message', (channel, message) => { // eslint-disable-line no-unused-vars
  console.log('Received from: ', channel, message);
});

redisClient.subscribe('test');

Logs

Initial connect

info: Connected to Redis
info: subscription_set:
info: Subscribed to:  test 1

Issuing a SAVE command in redis-cli

info: Error: Redis connection to 34.213.3.19:6379 failed - read ECONNRESET
info: Reconnecting to Redis delay=983, attempt=4, code=ECONNRESET, errno=ECONNRESET, syscall=connect, address=127.0.0.1, port=6379, total_retry_time=1118, times_connected=1
info: Connected to Redis
info: subscription_set: subscribe_test=test
info: Subscribed to:  test 1

It is at this point that the client will no longer receive any data published to the channel test. If you have any questions or need more info to reproduce please let me know.

bug

Most helpful comment

Ok, finally able to reproduce reliably. There are a few requirements:

  • redis 3.2.0 - MUST set tcp-keepalive 0
  • setup a channel
  • configure node script to connect, and subscribe to channel
  • wait 2 hours and 1 minute

What you will see is that every 2 hours (at least for me) is an ECONNRESET error. At this point, the socket appears to still be open in redis but maybe the node process has lost a reference to it and opened a new socket? Overnight I had like 1 client connected per 2 hour block plus the currently running process when issuing CLIENT LIST from redis-cli.

It's important to note that the clients were all successfully reconnecting but not receiving data. I have a hunch that the data was being pushed out, but to the old connection not the new one... but that's totally a hunch.

Now, i'm not 100% sure if this is a node_redis bug or actually a bug with node, so if this is the wrong place to file just let me know. I'd be happy to try and debug/step through but have limited time.

The fix was to set tcp-keepalive 300 which keeps the connection active... but this feels wonky since the client should be able to successfully reconnect.

All 6 comments

@BridgeAR @iamjem Might be related to #1249

Thanks for your detailed description, however I have not been able to reproduce it yet. When I issue a SAVE through redis-cli the connection is not reset.

When I manually stop the redis server and start it again it seems to receive messages fine.

I have been testing this on Mac, will try to reproduce it too on Linux

Hmmm... that's interesting. It was occurring with a totally stock redis installation built from source. The only option I changed was to allow remote connections. I'll take a look shortly and see if I can figure out what's up. @jdegger

@jdegger I'm not able to repro locally, only in AWS under super specific conditions. I can create a second copy of this AMI and provide code that will allow you to reproduce but it's seemingly something specific to running in AWS. Let me know if you're interested... otherwise I'd say go ahead and close the issue

Ok, finally able to reproduce reliably. There are a few requirements:

  • redis 3.2.0 - MUST set tcp-keepalive 0
  • setup a channel
  • configure node script to connect, and subscribe to channel
  • wait 2 hours and 1 minute

What you will see is that every 2 hours (at least for me) is an ECONNRESET error. At this point, the socket appears to still be open in redis but maybe the node process has lost a reference to it and opened a new socket? Overnight I had like 1 client connected per 2 hour block plus the currently running process when issuing CLIENT LIST from redis-cli.

It's important to note that the clients were all successfully reconnecting but not receiving data. I have a hunch that the data was being pushed out, but to the old connection not the new one... but that's totally a hunch.

Now, i'm not 100% sure if this is a node_redis bug or actually a bug with node, so if this is the wrong place to file just let me know. I'd be happy to try and debug/step through but have limited time.

The fix was to set tcp-keepalive 300 which keeps the connection active... but this feels wonky since the client should be able to successfully reconnect.

@shmendo

Our also shows that ECONNRESET error happens every 2 hours. But I checked tcp-keepalive, it is 300

CONFIG GET tcp-keepalive
1) "tcp-keepalive"
2) "300"
Was this page helpful?
0 / 5 - 0 ratings

Related issues

b96705008 picture b96705008  ·  7Comments

juriansluiman picture juriansluiman  ·  3Comments

Mickael-van-der-Beek picture Mickael-van-der-Beek  ·  6Comments

adamgajzlerowicz picture adamgajzlerowicz  ·  4Comments

jackycchen picture jackycchen  ·  4Comments