Feathers: Event channels/rooms

Created on 12 Aug 2016  ·  24 Comments  ·  Source: feathersjs/feathers

In many cases, service events need to be filtered so that they are emitted only to a limited set of clients (1-to-1 or 1-to-few user messages in a chat app, for example). Filters allow this, but at the cost of iterating through all connections for each event. This can cause unnecessary server load when the number of events and/or connected clients is large.
One nice feature that could improve this would be the ability to provide, along with the service event, a list of connections for the dispatcher to consider. That way, the forEach loop (io.sockets.clients().forEach(function(socket) {...})) could be performed only on this restricted list of connections (and on all connections by default, as is already the case). Alternatively, we could pass a list of user _ids and have feathers keep a mapping of _id to connection automatically.
With this improvement it would be possible, for example, to maintain a user._id <-> connection mapping (if not handled by feathers) and simply call the dispatcher to emit an event (like a 1-to-1 message) on a specific connection. Retrieving the connection directly from the user._id -> connection hash table is much quicker than iterating through all connections (with filters) to find the targeted one.
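
To make the idea concrete, here is a minimal sketch of such a mapping. Everything in it (connectionsByUser, register, unregister, emitToUsers, fakeConnection) is a hypothetical name for illustration, not an existing feathers API, and it assumes a connection is anything with an emit(name, data) method.

```javascript
// Hypothetical registry: user id -> set of live connections for that user.
const connectionsByUser = new Map();

function register(userId, connection) {
  if (!connectionsByUser.has(userId)) {
    connectionsByUser.set(userId, new Set());
  }
  connectionsByUser.get(userId).add(connection);
}

function unregister(userId, connection) {
  const set = connectionsByUser.get(userId);
  if (!set) return;
  set.delete(connection);
  if (set.size === 0) connectionsByUser.delete(userId);
}

// Emit an event only to the connections of the given user ids.
// Returns the number of connections the event was sent to.
function emitToUsers(userIds, eventName, data) {
  let sent = 0;
  for (const userId of userIds) {
    const set = connectionsByUser.get(userId);
    if (!set) continue; // user not connected to this server
    for (const connection of set) {
      connection.emit(eventName, data);
      sent++;
    }
  }
  return sent;
}

// Stand-in for a socket: records what was emitted to it.
function fakeConnection() {
  const received = [];
  return { received, emit: (name, data) => received.push({ name, data }) };
}

const alice = fakeConnection();
const bob = fakeConnection();
register('alice', alice);
register('bob', bob);

const sentCount = emitToUsers(['bob'], 'messages created', { text: 'hi' });
```

The cost of a 1-to-1 message then depends on the number of recipients rather than on the total number of connections.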

To give you an idea, I made a really simple benchmark comparing a trivial test run on every connection (like connection.user._id === message.to_user_id) with looking up a given entry in a connection hash table:

Here are the results:

with 10k connections:
foreach method: 3.608ms
hash method: 0.073ms

with 100k connections:
foreach method: 27.761ms
hash method: 0.126ms

with 1M connections:
foreach method: 284.016ms
hash method: 0.288ms

So, up to 10k connections, the difference is still "reasonable". Beyond that, the difference starts to be significant: with 1M users it takes roughly 1/3 of a second just to loop through the connections in the dispatcher for a single event. That means you couldn't handle more than about 3 events/s (at best) with 1M concurrent users, whereas you could handle roughly 1000x more if you knew a priori which connection to dispatch to.
And keep in mind that the number of events per second generally grows with the number of users...
So the forEach loop in the dispatcher can become a real bottleneck on each node if you want to use feathers for big apps.

Here is the code used for my really simple benchmark:

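var nbconnections = 1000000;

// Note: new Array(n) creates a sparse array (a length but no elements),
// so forEach skips every slot and the callback below is never invoked.
var array = new Array(nbconnections);
console.time("foreach method");
array.forEach(function(socket) {
    if (socket === 0) {
        // stand-in for a test like connection.user._id === message.to_user_id
    }
});
console.timeEnd("foreach method");

var hash = {};
for (var i = 0; i < nbconnections; i++) {
    hash[i + ''] = i;
}
console.time("hash method");
var conn = hash['1000'];
console.timeEnd("hash method");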
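
One caveat about a benchmark of this shape: new Array(n) creates a sparse array, and forEach does not call its callback for empty slots. The variant below is a sketch that uses a dense array of stand-in connection objects so the comparison runs on every entry. The object shape ({ user: { _id } }), the connection count and the variable names are assumptions for illustration. Timings will differ from the numbers above and vary by machine.

```javascript
// Sketch of a variant where the array is dense, so the callback runs for every entry.
const connectionCount = 100000; // assumed size, adjust as needed

// Dense array of stand-in connections, each tagged with a user id.
const connections = Array.from({ length: connectionCount }, (_, i) => ({
  user: { _id: String(i) }
}));

// Map from user id to connection, built once when connections are registered.
const byUserId = new Map(connections.map(c => [c.user._id, c]));

const targetId = '1000';

console.time('scan method');
let foundByScan = null;
connections.forEach(connection => {
  if (connection.user._id === targetId) {
    foundByScan = connection;
  }
});
console.timeEnd('scan method');

console.time('map method');
const foundByMap = byUserId.get(targetId);
console.timeEnd('map method');
```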
Breaking Change Proposal

All 24 comments

This came up before and I think it would be worth investigating a little more. It is very similar to the question of how to do rooms. Here is my first crack at how I think it could look. Let me know what you think:

// The `channel` service method returns a string which is the key for a
// registered channel
app.service('messages').channel(message => `rooms/${message.room_id}`);
app.service('todos').channel(todo => `company/${todo.company_id}`);

app.on('connection', socket => {
  socket.on('authenticated', data => {
    // data.token
    // data.user
    const { token, user } = data;

    // For each user room add the socket to a channel
    user.rooms.forEach(roomId => app.channel(`rooms/${roomId}`, socket));
    // Register the socket in the `company/<company_id>` channel
    app.channel(`company/${user.company_id}`, socket);
  });
});

I'd also be interested in a performance benchmark of the current event filtering. So far it hasn't come up as a bottleneck but having an easier way to create channels/rooms and potentially increasing the performance at the same time would be great.

A room system similar to the one in socket.io is, I think, a good idea. For convenience, each socket should also have its own single room to which it automatically subscribes (like the default room in socket.io http://socket.io/docs/rooms-and-namespaces/), which makes it easy to emit an event to a single client (by broadcasting to the target client's default room).
The question is how you would dispatch events to all subscribers of a room. If it is some kind of filter that loops through all socket connections of the server to pick the ones belonging to the desired room, you will end up with the same efficiency problem as the current dispatcher. If you only loop through the sockets in the channel (i.e. those that have joined the room to which the event must be broadcast), it would be OK.
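
A rough sketch of that second option, with a default room per socket. The names (rooms, joinRoom, onConnect, broadcast, fakeSocket) are made up for illustration and are not socket.io or feathers APIs.

```javascript
// Hypothetical room registry: room name -> set of sockets in that room.
const rooms = new Map();

function joinRoom(room, socket) {
  if (!rooms.has(room)) rooms.set(room, new Set());
  rooms.get(room).add(socket);
}

// On connect, every socket joins a default room named after its own id.
function onConnect(socket) {
  joinRoom(socket.id, socket);
}

// Broadcasting visits only the members of the room, never the full socket list.
function broadcast(room, eventName, data) {
  const members = rooms.get(room) || new Set();
  members.forEach(socket => socket.emit(eventName, data));
  return members.size;
}

// Stand-in for a socket: records what was emitted to it.
function fakeSocket(id) {
  const received = [];
  return { id, received, emit: (name, data) => received.push([name, data]) };
}

const s1 = fakeSocket('s1');
const s2 = fakeSocket('s2');
const s3 = fakeSocket('s3');
[s1, s2, s3].forEach(onConnect);
joinRoom('rooms/lobby', s1);
joinRoom('rooms/lobby', s2);

const lobbyCount = broadcast('rooms/lobby', 'message', 'hello lobby');
const directCount = broadcast('s3', 'message', 'just for you');
```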

I think that if the loop over connections in the dispatcher hasn't come up as a bottleneck so far, it is just because no one has used feathers in a large app (with more than 5k concurrent users) yet. But this will come sooner or later (feathers is still really new) and, from what I have seen with my really simple benchmark, there is no doubt this forEach loop would be an issue when you try to scale with feathers...
Running a "real" feathers stress test, with a single feathers server node and some client machines emulating 1k to 100k client connections, would be a good way to see the current limits and bottlenecks of feathers though. If you need volunteers to run a script that opens thousands of concurrent connections to the server for this stress test, I can run it on two of my computers (which have two distinct internet connections) with pleasure.

Yes. Channels are to narrow down the number of clients on connection. If you e.g. create a specific user channel there will only be one entry in the connection list to go over. It has to be a little more generic because Primus does not support rooms (well only in a plugin which I don't think needs to be added).

To move this forward I think we need to:

  • Figure out (and eventually implement) the channel API, which allows sticking connections into specific channels (or maybe they should be called rooms), and add the ability for services to narrow down which channels data should be broadcast to
  • Create a real-app benchmark for plain Socket.io, Feathers + event filters and a version for Feathers + channels to see what the improvements are

I would like to mention that in my experience more than 5k clients (with real websocket connections) per server in more than a basic app is a challenge with plain Socket.io as well - and physically impossible with 64k+ because the server will simply run out of ports (side note: Meteor chokes on a couple of hundred). 100k+ simultaneous connections on a single instance seems very unlikely so probably not a case we need to worry about (I found some Socket.io benchmarks here).

That said, I'm looking forward to the channel/room features because this is a frequently asked question that they would make much easier to answer, while also improving performance.

The 64k limit is a common misunderstanding. People tend to think that a server cannot accept more than 65,536 (2^16) TCP sockets because TCP ports are 16-bit integers.
But this is not the case: your server only uses one listening port for all sockets and distinguishes between them by the IP address and port of each client (so there is no theoretical limit on the number of concurrent socket connections a single server can handle).
I have seen reports of people claiming to have achieved more than 1M concurrent websocket connections on a node.js server and more than 100k on a pure socket.io one. So, with a bit of tweaking, it is probably achievable to have around 50k concurrent websocket connections on a single-node socket.io (or feathers.js ;) ) server if you have enough CPU power and RAM.
That being said, having a channel system in feathers.js that works for both socket.io and Primus would be a real benefit, I think (apart from the performance improvement, it would probably also ease many use cases that currently require defining filters).
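
To illustrate the point about the listening port: a TCP connection is identified by the combination of remote address, remote port, local address and local port, so many connections can share one server port. The sketch below only models that bookkeeping in memory. The addresses, the port 3030 and the helper connectionKey are made up for illustration, and no real sockets are opened.

```javascript
// Each TCP connection is identified by the full 4-tuple, not by the server port alone.
function connectionKey({ remoteAddress, remotePort, localAddress, localPort }) {
  return `${remoteAddress}:${remotePort}->${localAddress}:${localPort}`;
}

const server = { localAddress: '10.0.0.1', localPort: 3030 }; // made-up address and port
const table = new Map();

// 100,000 clients (more than 65,536), all connecting to the same listening port.
const clientCount = 100000;
for (let i = 0; i < clientCount; i++) {
  const client = {
    remoteAddress: `192.168.${Math.floor(i / 50000)}.1`, // made-up client addresses
    remotePort: 10000 + (i % 50000),
    ...server
  };
  table.set(connectionKey(client), client);
}

// Collect the server-side ports in use across all tracked connections.
const distinctLocalPorts = new Set([...table.values()].map(c => c.localPort));
```

In practice the limits come from file descriptors, memory and CPU rather than from the port number space on the server side.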

I guess we'll only really know what is possible with a reproducible benchmark of a somewhat real-life app 😄

Now to my actual API proposal for all of this:

1) Events and event filtering as hooks

Refactor event emitting and filtering into a hook. @ekryski and I talked about this before. Basically feathers-hooks will become a dependency of core and event emitting becomes a hook that always runs last.

2) app.channel

Add an app.channel method that lets you stick a connection (the event emitter) into a "channel" keyed by a string:

const updateChannels = (connection, user) => {
  // A channel just for this user
  app.channel(user._id, connection);

  // For each user room add the socket to the room channel
  user.rooms.forEach(roomId => app.channel(`rooms/${roomId}`, connection));

  // Register the socket in the `company/<company_id>` channel
  app.channel(`company/${user.company_id}`, connection);
};
app.on('connection', updateChannels);
app.on('login', updateChannels);

The tricky part here is to figure out a way to keep those channels in sync when the user is updated, e.g. leaving a room while still being connected.

3) Updated event filtering

Change the event filtering to only run once per event. This means it will pass all connections instead of running for each individual connection. The filter then returns the connections to which the event should be dispatched. You can quickly (O(1)) grab connections based on their channel id:

app.service('messages').filter('eventname', (message, connections, hook) => {
  // Just dispatch to one user
  if(message.isPrivate) {
    return connections.channel(message.receiver_id);
  }

  // The message room channel
  return connections.channel(`rooms/${message.room_id}`);

  // Alternative: EVERYBODY!
  return connections;

  // Alternative: filter connections manually, e.g. if the connection user and message user are friends
  return connections.filter(connection => connection.user.friends.indexOf(message.user) !== -1);
});

This will be a breaking change. Event filters will no longer be able to modify the data that are being dispatched. Modifying the event data can be done in hooks setting either hook.result or hook.dispatch (#376 is related). Chaining may still be possible to narrow down the connections but returning a channel will always return all connections in that channel. While we're at it we might as well make event filters mandatory.

Seems good to me.
Concerning the way to keep channels in sync, I think it is the responsibility of the developer to take charge of it. It should not be automatic (except when the user disconnects, in which case, of course, their connection should be automatically removed from each channel it was part of). So if you have some kind of app.unchannel() function, you should be able to call it in a service hook. That way, if a user has left a room, via your patch or update user service for example, you would call app.unchannel(`rooms/${roomId}`, connection) to remove their connection from the room(s) they have left (or, better, you could call this function directly in a service used to subscribe/unsubscribe to a room, if you have one, rather than handling it through the 'user' service).
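
A minimal sketch of what that could look like. channel, unchannel, onDisconnect and afterLeaveRoom are hypothetical helpers written as plain functions here, not an existing feathers API, and a connection is just a plain object.

```javascript
// Hypothetical helpers, not an existing Feathers API.
const channels = new Map(); // channel name -> Set of connections

function channel(name, connection) {
  if (!channels.has(name)) channels.set(name, new Set());
  channels.get(name).add(connection);
}

function unchannel(name, connection) {
  const members = channels.get(name);
  if (!members) return;
  members.delete(connection);
  if (members.size === 0) channels.delete(name);
}

// On disconnect, remove the connection from every channel it was part of.
function onDisconnect(connection) {
  for (const name of [...channels.keys()]) {
    unchannel(name, connection);
  }
}

// A hook-like function a "leave room" service could call after it succeeds.
function afterLeaveRoom(roomId, connection) {
  unchannel(`rooms/${roomId}`, connection);
}

const conn = { id: 'c1' };
channel('rooms/1', conn);
channel('rooms/2', conn);
afterLeaveRoom(1, conn);
const afterLeave = [...channels.keys()];
onDisconnect(conn);
const afterDisconnect = channels.size;
```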

Great discussion guys and thanks for kicking it off @ramsestom! @daffl I agree 💯 with everything you proposed there and @ramsestom we'd love help benchmarking stuff as this progresses for sure. I have a few example repos to wrap up with individual standalone services and it would be awesome to get some benchmarks on them once they are done. 😄

The tricky part here is to figure out a way to keep those channels in sync when the user is updated

I agree. Thinking out loud here, how is this channel management going to work across a cluster of services/apps? Is that going to change anything?

Is feathers-sync going to need to be adapted to support channels as well? My hunch is yes.

My guess is that feathers-sync would need to be adapted to support channels, but it probably wouldn't be too difficult. Instead of propagating a single event to all application instances, you would just have to propagate (an event + a channel_ID) to all app instances (so it is just a matter of adapting feathers-sync to add a channel_ID parameter to the data passed between cluster nodes).
Indeed, a channel can be split between the nodes of a cluster with no issue. You just have to keep channel labeling consistent across cluster nodes and populate each node's channels only with the connections attached to that node (so if an event ends up requesting that connection 'c1' (= from client 1) be added to channel 'A', for example, only the cluster node handling connection c1 would fulfill the request; the others would just ignore it silently as they do not have a c1 connection attached).
So a channel 'A' containing connections c1,c2,c3,c4 in a single-node app would become channel 'A' containing connections c1,c2 on node n1 and channel 'A' containing connections c3,c4 on node n2, for example. Then, as long as you know that an event should be propagated to channel A, it makes no difference whether all connections of this channel are attached to the same node or split between multiple ones.
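
Here is a small in-memory sketch of that split. createNode, handleJoin and handleEvent are hypothetical names, and the "cluster" is just an array of objects standing in for whatever feathers-sync would use as transport.

```javascript
// Each node keeps only its own connections, under the shared channel names.
function createNode(name) {
  const channels = new Map(); // channel name -> Set of local connections
  const local = new Map();    // connection id -> local connection

  return {
    name,
    attach(connection) { local.set(connection.id, connection); },
    // A join request is broadcast to all nodes; only the owner of the connection acts on it.
    handleJoin(channelName, connectionId) {
      const connection = local.get(connectionId);
      if (!connection) return false; // not attached here, ignore silently
      if (!channels.has(channelName)) channels.set(channelName, new Set());
      channels.get(channelName).add(connection);
      return true;
    },
    // A synced event carries the channel name; each node emits to its local members.
    handleEvent({ channel, event, data }) {
      const members = channels.get(channel) || new Set();
      members.forEach(connection => connection.received.push({ event, data }));
      return members.size;
    }
  };
}

const n1 = createNode('n1');
const n2 = createNode('n2');
const nodes = [n1, n2];
const conns = ['c1', 'c2', 'c3', 'c4'].map(id => ({ id, received: [] }));

n1.attach(conns[0]); n1.attach(conns[1]);
n2.attach(conns[2]); n2.attach(conns[3]);

// Every node sees every join request and every event.
conns.forEach(c => nodes.forEach(node => node.handleJoin('A', c.id)));
const delivered = nodes.map(node =>
  node.handleEvent({ channel: 'A', event: 'created', data: { text: 'hi' } })
);
```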

I might be missing something but does it? The filter runs for every _connected_ client on that server. If you are just scaling the same app, all connections will still be put in the right channel and events dispatched only to the matching connected client.

I don't see a need to synchronize anything other than the service events - which is what feathers-sync already does.

It depends on how feathers-sync currently works. Does it synchronize request service events or result service events? I mean, if I have a create() request with (data, params), is it that request that gets propagated to all app instances (and processed on each)? Or is it the result of this request (the promise result that should be returned to clients after the request has been processed) that gets propagated to the dispatchers of all apps?
I would have thought it would be the second option (otherwise, if a service request creates or updates an entry in a database, for example, it would cause issues as each app instance would try to perform it if they all process the request event...)

As I understood it, the pipeline in feathers is this one:

---------------------- performed only on the node instance receiving the request
request event
before hooks
service method itself
after hooks
result event
------------------ <- It is at this point that it would be synchronized between nodes with feathers-sync
------------------ performed on each node instance
filters
dispatcher

If it is the result event that is synchronized with feathers-sync, then yes, you would probably need to modify it. With the current filters implementation, since you are iterating through all connections, you can run the filters on each connection of each node, so you don't have to provide anything other than the event to be dispatched. But with channels, the event now needs to be dispatched to a selected set of connections. As you no longer use filters to select this set (based on some test function), you need to pass the channel information along with the event to know on which connections of your node the event must be emitted. This channel information can of course be integrated as a property of the event object itself, in which case you just pass an 'event' object as is currently done and look at the 'channel' property of your event in your dispatcher to emit it on the appropriate connections. It depends on how you would implement it.

This is the current final proposal of what will be implemented for this very shortly:

Terminology

  • channel is an object that holds any number of connections, potentially the data to dispatch and is usually registered under a certain name
  • connection is the Feathers specific information object on a bi-directional connection and contains information like the connected user. In the current implementations for Socket.io this would be socket.feathers and for Primus socket.request.feathers
  • dispatch is a function that returns a channel (and the data it should send) based on the event and/or hook data

Channels

app.channels, app.channel(... names)

Will either create and return a new channel for the given name or combine multiple channels into one

// A list of all channel names
app.channels

// A channel only for admins
app.channel('admins')

// A channel for message room with id 2
app.channel('rooms/2')

// A combined channel for admins and rooms/2
app.channel('admins', 'rooms/2');

// All channels
app.channel(app.channels)

Joining/leaving

.join(connection) and .leave(connection) can be called on a channel for a connection to join and leave a channel:

// Join the admin channel
app.channel('admins').join(connection);

// Leave the admin channel
app.channel('admins').leave(connection);

// Leave a channel conditionally
app.channel('admins', 'rooms').leave(connection => connection.userId === user._id);

// Leave all room channels
const roomChannels = app.channels.filter(channel => channel.indexOf('rooms/') === 0);

app.channel(roomChannels).leave(connection);

// Leave all channels
app.channel(app.channels).leave(connection);
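
For discussion, one way the API above could behave. This is only a sketch: the method names mirror the proposal, but Channel, CombinedChannel, createApp and the connections property are assumptions, and the real implementation may look quite different.

```javascript
// A sketch only: names mirror the proposal, the implementation is assumed.
class Channel {
  constructor() {
    this.connections = new Set();
  }

  join(connection) {
    this.connections.add(connection);
    return this;
  }

  // Accepts either a connection or a predicate selecting connections to remove.
  leave(target) {
    for (const connection of [...this.connections]) {
      const matches = typeof target === 'function' ? target(connection) : target === connection;
      if (matches) this.connections.delete(connection);
    }
    return this;
  }
}

// A combined channel forwards join/leave to every underlying channel.
class CombinedChannel {
  constructor(children) {
    this.children = children;
  }

  // Union of all member connections; a Set keeps each connection only once.
  get connections() {
    const all = new Set();
    this.children.forEach(child => child.connections.forEach(c => all.add(c)));
    return all;
  }

  join(connection) {
    this.children.forEach(child => child.join(connection));
    return this;
  }

  leave(target) {
    this.children.forEach(child => child.leave(target));
    return this;
  }
}

function createApp() {
  const registry = new Map(); // channel name -> Channel
  return {
    get channels() { return [...registry.keys()]; },
    // Accepts names as separate arguments or as an array of names.
    channel(...names) {
      const found = names.flat().map(name => {
        if (!registry.has(name)) registry.set(name, new Channel());
        return registry.get(name);
      });
      return found.length === 1 ? found[0] : new CombinedChannel(found);
    }
  };
}

const app = createApp();
const c1 = { userId: 1 };
const c2 = { userId: 2 };
app.channel('admins').join(c1);
app.channel('rooms/2').join(c1).join(c2);
const combinedSize = app.channel('admins', 'rooms/2').connections.size;
app.channel(app.channels).leave(connection => connection.userId === 1);
const remaining = [...app.channel('rooms/2').connections];
```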

Event dispatching

Event filters determine which channels an event should be sent to. Multiple event filters can be registered to send data to multiple channels. If a connection is in multiple channels, the event will only be sent once (if the data is the same; we might have to figure out how to do that).

// Handle a certain event for all services
app.dispatch('eventname', (message, hook) => {
  // Just dispatch to one user
  if(message.isPrivate) {
    return app.channel(message.receiver_id);
  }

  // Returning falsy or nothing will do nothing
});

// Handle a certain event for a specific service
app.dispatch('servicename', 'eventname', (message, hook) => {
  // Just dispatch to one user
  if(message.isPrivate) {
    return app.channel(message.receiver_id);
  }

  // Returning falsy or nothing will do nothing
});

// Send to a certain room
app.dispatch('messages', 'eventname', (message, hook) => {
  return app.channel(`rooms/${message.roomId}`);
});

// EVERYONE
app.dispatch('messages', 'eventname', (message, hook) => {
  return app.channel(app.channels);
});

// Filter connections manually, e.g. if the connection user and message user are friends
// This works similar to the old event filters
app.dispatch('messages', 'eventname', (message, hook) => {
  return app.channel(app.channels).filter(connection => connection.user.friends.indexOf(message.user) !== -1);
});

Note: The hook object for custom events will contain { service, app, path, event }.

It can also determine with which data:

app.dispatch('messages', 'eventname', (message, hook) => {
  const modifiedMessage = cloneAndModify(message);

  return app.channel(`rooms/${message.roomId}`, `rooms/general`).send(modifiedMessage);
});
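
To make the "sent only once" behaviour concrete, here is a rough sketch of a dispatcher registry. dispatch and publish are stand-alone functions here rather than app methods, the returned channel is assumed to be an object with a connections list and optional data, and, as a simplification, a connection is skipped once it has received the event even if a later dispatcher returns different data.

```javascript
// Assumed shape: a dispatcher returns { connections, data } or nothing.
const dispatchers = [];

// dispatch(eventName, fn) or dispatch(serviceName, eventName, fn)
function dispatch(...args) {
  const fn = args.pop();
  const eventName = args.pop();
  const serviceName = args.pop(); // undefined means "all services"
  dispatchers.push({ serviceName, eventName, fn });
}

// Run all matching dispatchers and send to each connection at most once.
function publish(serviceName, eventName, message) {
  const alreadySent = new Set();
  for (const d of dispatchers) {
    if (d.eventName !== eventName) continue;
    if (d.serviceName !== undefined && d.serviceName !== serviceName) continue;
    const result = d.fn(message, { path: serviceName, event: eventName });
    if (!result) continue; // falsy or nothing: do nothing
    const data = result.data === undefined ? message : result.data;
    for (const connection of result.connections) {
      if (alreadySent.has(connection)) continue;
      alreadySent.add(connection);
      connection.received.push(data);
    }
  }
  return alreadySent.size;
}

const ann = { received: [] };
const ben = { received: [] };
const room = { connections: [ann, ben] };
const admins = { connections: [ann] };

// For all services: public messages go to the room.
dispatch('created', message => (message.isPrivate ? null : room));
// For the messages service only: admins always get the event.
dispatch('messages', 'created', () => admins);

const publicCount = publish('messages', 'created', { text: 'hello' });
const privateCount = publish('messages', 'created', { text: 'psst', isPrivate: true });
```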

Keeping channels updated

Some examples for how to update channels when the user or login status changes:

Anonymous users

app.on('connection', connection => {
  app.channel('anonymous').join(connection);
});

app.on('login', (payload, meta) => {
  const connection = meta.connection;

  // Connection can be undefined e.g. when logging in via REST
  if(connection) {
    // Leave anonymous channel first
    app.channel('anonymous').leave(connection);

    // Get the user object and stick into channels
    app.service('users').get(payload.userId).then(user => {
      // A channel just for this user
      app.channel(`users/${user._id}`).join(connection);

      // Put user into the chat rooms they joined
      user.rooms.forEach(roomId => {
        app.channel(`rooms/${roomId}`).join(connection);
      });
    });
  }
});

When the user is changed

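app.service('users').on('patched', user => {
  // Remove all connections belonging to this user from every channel
  app.channel(app.channels).leave(connection => connection.user._id === user._id);

  // Re-add the user to their channels
  // Note: `connection` is not defined in this handler as written; the user's
  // connection(s) would have to be looked up before joining
  // A channel just for this user
  app.channel(`users/${user._id}`).join(connection);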

  // Put user into the chat rooms they joined
  user.rooms.forEach(roomId => {
    app.channel(`rooms/${roomId}`).join(connection);
  });
});

Will the connection object be available in a hook? We'll need a way to manage channel membership in a hook. Should socket connections have context.connection in the hook object?

// Join the admin channel
app.channel('admins').join(connection);

// Leave the admin channel
app.channel('admins').leave(connection);

Well the connection object for sockets is what gets merged into the service method calls params (it is not the same object though). What would you need the connection for in a hook?

The one thing I could see is that it would make it easier to not send the event to the user that called the method but we have always been advising against doing that.

For example, when a user creates a new room in a chat app, where will the code for creating the channel and joining the user's connection to that room's channel reside?

That is a good question. I would say by default we can put some templates in the user services setup file that currently loads and sets the filters.

Maybe it'll be a little bit tricky to make it possible to do channel membership in a hook, since the connection data is the same as params, but it seems like it would be very natural to want to do this in a hook.

Another thing that we haven't covered is why it's not possible to do socket event emitting in a hook. It would be nice to be able to app.channel('admins').dispatch({...}) in a hook. You were saying something about why it wouldn't work, but I didn't catch it.

For channel membership inside hooks, the solution could be to keep a synchronized dictionary of {user_id => connection} somewhere at the root of the feathers app, updated each time a user logs in or out.
This way, join() and leave() could be called from inside a hook using map[params.user.id], without having to pass the connection object to hooks directly.

As for the possibility to dispatch custom events at any point of the app (not just service events, but basically any custom event with custom data), I agree it would be great to be able to do something like app.channel('admins').dispatch({data to dispatch}) in a hook

Another thing that we haven't covered is why it's not possible to do socket event emitting in a hook. It would be nice to be able to app.channel('admins').dispatch({...}) in a hook. You were saying something about why it wouldn't work, but I didn't catch it.

Agreed. This is something I'm a bit fuzzy on as well @daffl. I think we'd like to be able to do that.


For example, when a user creates a new room in a chat app, where will the code for creating the channel and joining the user's connection to that room's channel reside?

We might need to do something like what @ramsestom suggested. I'm thinking that you might want to have a convenience method to get a socket connection by either socket id or entity id (i.e. a user) attached to that socket. Although doable, this might start to get a little hairy when dealing with multiple app instances and a bunch of sockets coming and going.

However, now that I think about this a bit more to cover @marshallswain's use case could/should that not be 2 API calls?

  • One to create the channel
  • The other to "join" the channel

Thinking on this a bit more.... 2 API calls does work but only if you know the channel that you can join. Basically a user can only add herself to a channel. If you want to do something like what Slack does where you can add another user to a channel then it might not be possible...

I guess you have 2 options in that case:

  1. A user adds you to a channel you don't know about. Rather than actually getting added, you get an invite to a channel and confirm it to "join" the channel. Therefore, you are still adding yourself.

  2. A user actually adds you to the channel. In this case you would need a way to find a connection by entity (user) id.

Getting in the weeds a bit...

I think this stuff is doable without a user id -> connection mapping and with the proposal that @daffl put forward. This is also why it will be good to do a pre-release and put this into prod/staging in a couple apps to see where the edge cases lie.

Maybe we do a first cut of this without the mapping? What do y'all think?

Agreed. This is something I'm a bit fuzzy on as well @daffl. I think we'd like to be able to do that.

Since @marshallswain and @ekryski mentioned it. I think it is fine to emit events to your own (and maybe another) service in a hook but you should not be able to send stuff to connected clients or channels directly from a hook. I think that would undermine the separation of services (that have methods and run hooks) from the event system for sending real-time data.

Just like we say services should be transport independent, I also think they should be "connection unaware" as in, a service itself (and its hooks) shouldn't have to worry about who is connected to the server it is running on. I realize that that's not entirely true because we are still registering filters/dispatchers via app.service('myservice').dispatch on the service but at least it is only for that specific purpose whereas a hook can do anything.

For example, when a user creates a new room in a chat app, where will the code for creating the channel and joining the user's connection to that room's channel reside?

Isn't that exactly what

app.service('users').on('patched', user => {
  // Leave all channels belonging to a user
  app.channel(app.channels).leave(connection => connection.user._id === user._id)

  // Re-add the user to their channels  
  // A channel just for this user
  app.channel(`users/${user._id}`).join(connection);

  // Put user into the chat rooms they joined
  user.rooms.forEach(roomId => {
    app.channel(`rooms/${roomId}`).join(connection);
  });
});

Would do assuming that a user gets patched when they join a room?

@daffl Yup that would work. 😄

@daffl @ekryski @marshallswain I solved exactly that scenario, maybe more simply, with a dynamic feathers service:

I have created a service that registers a custom message service on a custom namespace like /channel/name. On the client I simply connect to the same /channel/name service and I have messaging.

Todo

  • create a feathers service on the server that listens to /channel/:name
    • it creates the channel message service if one is not already registered for /channel/:name
  • if socket.io or primus is used, create an init hook that filters /channel* messages and again creates the /channel/name service
  • on the client, simply register and use the /channel/name service

Hope that helps. I use this to connect different devices in different rooms in a bigger deployment that uses realtime.

This is also the most scalable approach, since you can later send each channel name to different boxes, like Kafka deployments that easily handle some 100 million messages per second.

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue with a link to this issue for related bugs.
