C-toxcore: Document questions and answers related to crypto and p2p networking

Created on 5 Jan 2017 · 10 Comments · Source: TokTok/c-toxcore

How can I prevent leaking my traffic to nodes?
Nodes should be used only for helping clients find each other, not for delivering data.
It looks like a MITM. It isn't better than Skype with Microsoft servers.

MrSorcus

Most helpful comment

I understand how this point can feel incorrect, so I've filed a documentation issue to improve our presentation of that. Thanks for explaining your thoughts.

To quickly explain your specific concern, I'll paraphrase it. Please let me know if I misunderstood:

Q: _I am concerned about the fact that data packets coming from my computer are not delivered directly to my friend's computer, but are sometimes (or all the time, depending on network conditions) relayed via third party computers._

A: First, consider a packet going directly from your computer to your friend's computer, assuming you're on a local wifi:

Your computer creates an encrypted Tox packet (see encryption details), which it wraps in a UDP packet containing the source/destination ports (random/33445), then in an IP packet containing the source/destination IP addresses (you and your friend), then a Wireless frame containing the source/destination MAC addresses (you and access point(s)).
This Wireless frame is sent over an AES encrypted channel (assuming WPA2).
The wireless access point as well as every other user of the network receives it. Other users usually ignore the packet, but don't have to. Reading packets that are not meant for you is called sniffing. This is the first point at which others can read the packet.
The wireless access point is connected to a WAN (internet) router that establishes a connection to the routers of your internet service provider via some means (acoustic coupling, ISDN, ADSL, Cable, Satellite, fibre, ...). This connection can be encrypted (some satellites) but usually isn't. At this point, the WAN router you've sent your packet to will create an Ethernet frame containing its own MAC address and the MAC address of the next router it connects to, which is your ISP router. In transit, the IP and MAC addresses are likely unencrypted and can be read by anyone sniffing the fibre optics between you and the ISP. This is the second point at which others can read the packet.
Once the packet has arrived at the ISP, it will go through various internal systems in their data centres, possibly encrypted or unencrypted at various points in time. The ISP router will then decide the next router it should send the packet to, determined by the IP address. To do this, it will unwrap the Ethernet frame it received and wrap it in a new one and send it off to the next router. At any point during processing at the ISP, data centre or system administration employees of that ISP can access your packet. Third point of access.
Now your packet will be hopping from router to router (try traceroute $your_friend_ip to see where it goes), each of which has free access to the Ethernet frame, the IP packet, and the UDP packet as well as its contents. Many points of access, let's call it the 4th.
Then your friend may be in a similar situation, in a local wifi in a different place. Again their router and members of the wireless network can read the packet. Fifth point of access.
Only now, finally, your packet arrives at your friend's computer. It receives a Wireless frame, unwraps it to find an IP packet, unwraps it to find a UDP packet, unwraps it to find a Tox packet, which is then processed by the Tox protocol implementation. This processing involves decrypting and decoding the packet and acting on it, usually resulting in your friend's client displaying the message, playing audio or video, or performing some other application-level activity.
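
The wrapping described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the headers are toy versions of the real UDP, IP and link-layer headers, the addresses are documentation examples, and the "encrypted Tox packet" is stood in for by random bytes rather than real Tox encryption. The helper names are made up for this sketch and are not part of toxcore.

```python
import os
import socket
import struct

TOX_PORT = 33445  # default Tox UDP port mentioned in the explanation above


def wrap_udp(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Prefix a simplified 8-byte UDP header (checksum left at 0)."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + payload


def wrap_ip(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Prefix a toy IP header that holds only the two IPv4 addresses."""
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment


def wrap_frame(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    """Prefix a toy link-layer header that holds only the two MAC addresses."""
    return dst_mac + src_mac + packet


def inspect_as_router(frame: bytes) -> dict:
    """Return what any hop on the path can read without holding a Tox key."""
    packet = frame[12:]  # skip the two 6-byte MAC addresses
    src_port, dst_port, length, _ = struct.unpack("!HHHH", packet[8:16])
    return {
        "src_ip": socket.inet_ntoa(packet[0:4]),
        "dst_ip": socket.inet_ntoa(packet[4:8]),
        "src_port": src_port,
        "dst_port": dst_port,
        "udp_length": length,
        "payload": packet[16:],  # still opaque: this is the encrypted part
    }


# Placeholder for an already encrypted Tox packet (assumption: random bytes).
tox_packet = os.urandom(64)

frame = wrap_frame(
    wrap_ip(wrap_udp(tox_packet, 50000, TOX_PORT), "192.0.2.10", "198.51.100.20"),
    src_mac=bytes.fromhex("020000000001"),
    dst_mac=bytes.fromhex("020000000002"),
)
seen = inspect_as_router(frame)
```

Every hop can recover the addresses and ports in `seen`, but the `payload` entry is only the encrypted bytes, which is the point the walkthrough makes.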

As you see, there are many points during the "direct" transmission from you to your friend where the packet can be inspected by arbitrary people. End-to-end encryption means that at no point between you and your friend can anyone read the actual contents you intended to convey. They can always only see the encrypted data.

Now, adding a TCP relay in the middle will simply lengthen the route (it could theoretically shorten it, but that's not likely). Anyone running the relay can read the encrypted packet, just like anyone else between you and your friend, but not its contents. The Tox crypto protocol ensures that your communication is secure.

Now, I also see a second concern:

Q: _What happens if one of the nodes relaying my data is evil?_
A: Tox selects a number of TCP relays that it can use to communicate in case direct UDP connections are not possible (e.g. due to NAT or firewall). There are very few evil things an evil relay can do:

They can choose not to send the packet. In this case, Tox will retry via a different relay. Only if all bootstrap nodes are evil will a transmission fail.
They can send a modified packet. Thanks to message authentication codes, any tampering with the data is likely to be detected. If the relay manages to tamper in a way that is undetected by software, the decrypted packet will be garbage, and the application layer (the Tox protocol decoder) will discard it. This therefore has the same effect as not relaying the packet at all.
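
The two behaviours above (dropping and modifying) can be illustrated with a small Python sketch. It is not how Tox is implemented: HMAC-SHA256 from the standard library stands in for the authenticator Tox actually uses, the "ciphertext" is just bytes, and the relay functions and `send_via_relays` are hypothetical names invented for this example.

```python
import hashlib
import hmac
import os
from typing import Callable, Iterable, Optional

TAG_LEN = 32  # length of an HMAC-SHA256 tag (stand-in for the real authenticator)

Relay = Callable[[bytes], Optional[bytes]]


def seal(key: bytes, ciphertext: bytes) -> bytes:
    """Append an authentication tag computed over the ciphertext."""
    return ciphertext + hmac.new(key, ciphertext, hashlib.sha256).digest()


def open_sealed(key: bytes, packet: bytes) -> Optional[bytes]:
    """Return the ciphertext if the tag verifies, otherwise None."""
    if len(packet) < TAG_LEN:
        return None
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return body if hmac.compare_digest(tag, expected) else None


def honest_relay(packet: bytes) -> Optional[bytes]:
    return packet


def dropping_relay(packet: bytes) -> Optional[bytes]:
    return None  # evil behaviour 1: never forward


def tampering_relay(packet: bytes) -> Optional[bytes]:
    return bytes([packet[0] ^ 0x01]) + packet[1:]  # evil behaviour 2: flip a bit


def send_via_relays(key: bytes, packet: bytes, relays: Iterable[Relay]) -> Optional[bytes]:
    """Try relays in order until one delivers a packet that authenticates."""
    for relay in relays:
        forwarded = relay(packet)
        if forwarded is None:
            continue  # dropped: retry via the next relay
        body = open_sealed(key, forwarded)
        if body is not None:
            return body  # delivered intact
        # tampered: the receiver discards it, same effect as a drop
    return None


key = os.urandom(32)  # assumption: a key already shared by the two friends
message = b"already-encrypted payload"
packet = seal(key, message)
delivered = send_via_relays(key, packet, [dropping_relay, tampering_relay, honest_relay])
```

With at least one honest relay in the list, `delivered` equals `message`; with only evil relays the function returns `None`, which corresponds to the "transmission fails" case and nothing worse.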

That's basically it. In no case can the evil relay read your data. It can only choose not to relay, and only if every bootstrap node is evil will you be unable to communicate. This would be pretty annoying, and we would be unhappy about it, but nobody's information is compromised at any point.

I hope this clarifies some things. I have not proofread this reply, but I'll make sure it's properly represented on the website for future reference. Let me know if you have any other concerns. Thanks again for bringing this up.

iphydf on 5 Jan 2017

👍7

All 10 comments

You are mistaken. DHT bootstrap nodes exist to facilitate joining the DHT. If you'd like further explanation, you can look over this article: https://en.wikipedia.org/wiki/Distributed_hash_table. Alternatively, you can join us in IRC, in the #tox channel on Freenode, and we'll try to explain things as best we can.

Zer0-One on 5 Jan 2017

👎1 👍1

Your traffic may go through bootstrap nodes acting as TCP relays while tox uses TCP for the friend connection. It's similar to TURN. Your traffic is still end-to-end encrypted, so the confidentiality and authenticity of your messages are never compromised. This TCP relaying is probably what you saw in your traffic analysis. It is described in detail in the tox protocol specification.

iphydf on 5 Jan 2017

[Screenshot: utox-inline_1]

This is voice traffic. It's encrypted, and third parties can't decrypt it now, but that doesn't mean it will be impossible in the future.

I have a direct IPv4/IPv6 address. Why should I send my data to nodes?
You say 'DHT bootstrap nodes exist to facilitate joining the DHT.', but that's not true. In the attached screenshot, traffic passes through nodes, not directly to me.

'Tox is easy-to-use software that connects you with friends and family without anyone else listening in.' - is that a lie? The traffic is encrypted, OK. But it uses nodes for delivery? I don't know who maintains these nodes. What if one or more nodes are fake?

MrSorcus on 5 Jan 2017

I just read your message again, and discovered that I missed one more concern:

Q: Although data is encrypted now, what ensures that it won't be decrypted in the future?
A: The Tox protocol implements perfect forward secrecy through the use of ephemeral keys. This means that if one of those keys is compromised, a few messages can be decrypted, but not your entire communication history. The "a few messages" part of this sentence will be reduced to "one message" in the future. If your long term secret key is compromised, no past communication can be decrypted.
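
To make the idea of ephemeral keys concrete, here is a toy Python sketch. It deliberately does not use curve25519: it uses plain finite-field Diffie-Hellman over a small Mersenne prime so that it runs with the standard library only, and it is far too weak for real use. All names in it are hypothetical. It shows only the principle that each session gets fresh key pairs, so learning one session key reveals nothing about another session.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT what Tox uses): 2**127 - 1 is prime,
# but much too small to be secure.
P = 2**127 - 1
G = 3


def new_ephemeral_keypair() -> tuple[int, int]:
    """Generate a fresh (secret, public) pair meant to be used for one session."""
    secret = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return secret, pow(G, secret, P)


def session_key(my_secret: int, their_public: int) -> bytes:
    """Derive a 32-byte session key from the Diffie-Hellman shared value."""
    shared = pow(their_public, my_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def run_session() -> tuple[bytes, bytes]:
    """Both sides create ephemeral keys, swap public parts, derive a key."""
    a_secret, a_public = new_ephemeral_keypair()
    b_secret, b_public = new_ephemeral_keypair()
    key_a = session_key(a_secret, b_public)
    key_b = session_key(b_secret, a_public)
    # The ephemeral secrets go out of scope here and are never stored.
    return key_a, key_b


first_a, first_b = run_session()
second_a, second_b = run_session()
```

Both sides derive the same key within a session, and the two sessions end up with unrelated keys because neither is derived from a long-term secret.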

If the cryptographic primitives we use are broken, we lose. How badly depends on the way in which they are broken. These are the possible worst-case scenarios:

The cipher (salsa20) can be easily reversed without a key. In this case, all past communication is compromised.
The key exchange primitive (curve25519) is broken in a way that the secret key can easily be recovered given a public key. In this case, all past communication is compromised as well.

These scenarios are very unlikely to become reality in the near future, or possibly forever. By current understanding in the cryptography community, only quantum computing could make the second scenario happen. The first scenario is believed to be impossible.

This all said, I also noticed that you said you have a direct IPv4 address. What does this mean? If you have a public IPv4 address assigned to your computer, and port 33445 is open, Tox should establish direct connections very quickly. If it doesn't, that's a bug and we should work together to find out why it chooses to use TCP instead.

iphydf on 5 Jan 2017

👍3

Thanks a lot for this explanation. Now I understand a little more.
I'm not sure about the direct IPv4 address... I use a WireGuard VPN. WireGuard is installed on a virtual server that has direct IPv4 and IPv6 addresses. All the traffic is wrapped in a namespace.
Laptop network info: https://gist.github.com/DebugReport/1268e15c3bd1c99b56929d645d99392b
If I was mistaken, I'm sorry.
Maybe IPv4 isn't direct, but what about IPv6? Can I use direct connections if the other client has IPv6 too?

MrSorcus on 5 Jan 2017

Yes, if both parties have IPv6, and the firewall configuration doesn't block port 33445 (or some other port near that, something between 33445 and 33545), it should work. Is your friend in the same VPN?

iphydf on 5 Jan 2017

No.
Hmm... Question: do we always need to use nodes? Or only if one of us doesn't have a direct IP (IPv4 only?)?
For IPv6 (me) <-> IPv6 (friend), are nodes necessary? If yes, why?

MrSorcus on 5 Jan 2017

(Keeping this issue open until all these questions are answered in documentation)

If one of you has a public IP, then the other one can bootstrap using the other's IP and port. This requires client support I don't think any client currently has:

Get the DHT public key from the client with public IP and open port (tox_self_get_dht_id) and its port (tox_self_get_udp_port).
Send this key to the other one somehow (dictate via phone or message on Skype or something).
The other one now needs to bootstrap using the (key, ip, port) tuple.
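
As a rough illustration of the second step, the (key, ip, port) tuple could be turned into a single line of text that is easy to dictate or paste, and checked on the receiving side before it is handed to the bootstrap call (in the C API that would be tox_bootstrap). The text format and the helper names below are assumptions made up for this sketch; no client is known to use them. The only fact taken from toxcore is that a public key is 32 bytes long.

```python
import ipaddress
from typing import NamedTuple

TOX_PUBLIC_KEY_SIZE = 32  # bytes, same value as the constant in tox.h


class BootstrapInfo(NamedTuple):
    public_key: bytes
    host: str
    port: int


def encode(info: BootstrapInfo) -> str:
    """Render the tuple as 'HEXKEY HOST PORT' (hypothetical sharing format)."""
    return f"{info.public_key.hex().upper()} {info.host} {info.port}"


def decode(text: str) -> BootstrapInfo:
    """Parse and validate a line produced by encode(); raise ValueError if bad."""
    parts = text.split()
    if len(parts) != 3:
        raise ValueError("expected: HEXKEY HOST PORT")
    key_hex, host, port_text = parts
    key = bytes.fromhex(key_hex)  # raises ValueError on non-hex input
    if len(key) != TOX_PUBLIC_KEY_SIZE:
        raise ValueError("DHT public key must be 32 bytes")
    ipaddress.ip_address(host)  # raises ValueError unless IPv4 or IPv6 literal
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return BootstrapInfo(key, host, port)


# Example values only: a dummy key and a documentation IPv6 address.
info = BootstrapInfo(bytes(range(32)), "2001:db8::1", 33445)
line = encode(info)
parsed = decode(line)
```

Validating the key length and address before bootstrapping means a mistyped character is caught early instead of silently failing to connect.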

After this, you have a personal 2-people Tox network. So, in theory you don't need any other nodes. They do make things easier, though.

If one of you has a public IP and open port, then connecting to bootstrap nodes should also let you establish a direct connection. DHT bootstrap nodes have little to do with whether you can connect or not. A direct connection should be possible even if only one of you has a public IP and open port. The other one would connect to it, which would create a route in the local router and give the client a temporary random public port.

iphydf on 5 Jan 2017

Just a note: I noticed the same behavior with C-Toxcore. One of the parties is on a VPS with a public IP address and no firewall, the other is behind NAT but has the Tox port forwarded - so they should be mutually reachable. Traffic was still routed via TCP.

I don't see this as a security issue, but it's certainly a scalability issue if a P2P network is relaying all its traffic via relays.