C-toxcore: Can't send messages in persistent group chat

Created on 30 Jun 2018  ·  9 Comments  ·  Source: TokTok/c-toxcore

With persistent groups, it sometimes happens that I can't send messages in a group chat I'm connected to. Unfortunately, I don't know how to reproduce this bug; it happens randomly. I appear as online and I can message people, but I am not able to send messages to one or more groups. Normally, if I lost connection to a group, I should just be reconnected, but here that never happens. The only way for me to be able to send messages again is to manually leave the group and join it again, or to restart the client.

All 9 comments

I could reproduce this bug and noticed that I can still receive messages from some people in the group.

Closing because we got rid of the old PGC PR.

I just hit this issue running qTox with toxcore v0.2.9. I loaded a persistent group that I've loaded a few times before and was unable to send messages or set title. I still receive peer messages. Restarting qTox doesn't fix the issue. Other groups still work fine. There's no log on message send or title set.

Since the group seems to be stuck this way, I'd be happy to dig into what's going on with gdb if a core dev can give me some guidance.

We hit the same issue with a new group member, running Toxcore 0.2.9:

We have the same group chat with 3 members, A, B, and C. All three were online, but A's messages weren't delivered to B or C, and A also didn't receive their own messages. This persisted across multiple client restarts of A.

During the chat, A saw either B or C disconnect and reconnect in the group, and B and C saw each other disconnect and reconnect in their 1-1 chat. After that point, A's messages were then delivered to A and B, but still not C.

After a while, C closed and re-opened their client, and then all of A, B, and C could see everyone's messages. @zugz this is one of the cases you asked me about on IRC IIRC. I'll reopen this issue since it's reproducible on tip and sounds like it's being looked into.

Thanks for reporting this. I'm currently wholly mystified, and haven't
managed to reproduce the bug.

Some questions to narrow down where the problem could be:

Did any of the members join or leave any other groups?

Which pairs of A,B,C were tox friends?

Am I right to understand that during the period when A's messages
weren't being sent to everyone, A nonetheless saw both B and C in the
peer list for the group (except during brief disconnections)?

I hit this again, now on v0.2.10. Sorry, will follow up on your questions now for the latest repro case. In this case, the group had 4 members. A was in the call, crashed, then started back up and rejoined the call. After A rejoined, none of the peers could hear A, but A could hear all of them; likewise, none of the peers received A's text messages, though A saw their own.

Which pairs of A,B,C were tox friends?

A was friends with B and C
B was friends with A, C, and D
C was friends with A and B
D was friends with B

Did any of the members join or leave any other groups?

No. All 4 were in a group audio call playing some games - none were doing anything tox related.

Am I right to understand that during the period when A's messages
weren't being sent to everyone, A nonetheless saw both B and C in the
peer list for the group (except during brief disconnections)?

Yes. B, C, D in this case all showed in the peer list, and A was receiving audio for all of them.

A leaving and being re-invited to the group did not fix the issue, and A restarting their client did not fix the issue. All 4 members needed to move to a new group, where things then worked.

A was using qTox which saves the tox profile pretty rarely - only on friend add/remove basically, so I don't see how the qTox crash during the call could somehow corrupt tox profile state - but possibly the crash is an important part of the repro.

A was in the call, crashed,

Thanks. That was the hint I needed. I believe I see the problem now:
messages from a particular peer come with a "message number",
incremented with each message from that peer, and messages with too old
a message number are ignored by other peers. The current message number
is saved in the savedata, but if there's a crash and the peer restarts
from an old save file, they'll start from an old message number and so
other peers will ignore them. In particular they ignore the kill message
that the peer sends if it leaves the group, which explains why leaving
and rejoining doesn't help - the other peers don't even realise the peer
left.

All this goes for audio too (there's a separate message number for lossy
packets).

This may take some thought to fix properly.
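
To make the failure mode described above concrete, here is a minimal sketch in C. It is not the actual c-toxcore code: the struct and function names (`Sender`, `Peer_Recv_State`, `accept_message`, `count_accepted`) are hypothetical, and the acceptance rule is simplified to "strictly newer than the last number seen", whereas the real check for "too old" may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender state: the counter that would live in the savedata. */
typedef struct Sender {
    uint32_t next_message_number;
} Sender;

/* Hypothetical state another peer keeps about that sender. */
typedef struct Peer_Recv_State {
    uint32_t last_message_number;
    bool     seen_any;
} Peer_Recv_State;

/* Simplified assumption: accept only numbers newer than the last one seen. */
static bool accept_message(Peer_Recv_State *st, uint32_t message_number)
{
    if (st->seen_any && message_number <= st->last_message_number) {
        return false; /* treated as old, silently ignored */
    }

    st->seen_any = true;
    st->last_message_number = message_number;
    return true;
}

/* Send n messages from s to the receiver and count how many are accepted. */
static size_t count_accepted(Peer_Recv_State *st, Sender *s, size_t n)
{
    size_t accepted = 0;

    for (size_t i = 0; i < n; ++i) {
        if (accept_message(st, s->next_message_number++)) {
            ++accepted;
        }
    }

    return accepted;
}
```

In this sketch, copying a `Sender` before a run of messages and restoring that copy afterwards plays the role of restarting from an old save file after a crash. Everything the restarted sender emits is rejected until its counter passes the receiver's `last_message_number`. A kill message sent on leaving the group would carry a stale number in the same way, which is consistent with leaving and rejoining not helping.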

I think this ticket was accidentally closed. The root cause discovered by zugz was a few months after https://github.com/TokTok/c-toxcore/pull/1321 was opened, and based on chats this issue is still unresolved. Reopening.

Ah, whoops. Yeah, the commit message in #1321 had "possibly fixes x" and "fixes x" is some kind of magic GitHub thing that auto-closes the issue mentioned when merged.
