Riot: IEEE802.15.4: HW Auto ACK considered harmful

Created on 9 Dec 2019  ·  7 Comments  ·  Source: RIOT-OS/RIOT

Description


Most of our radios are configured with the Auto ACK feature enabled. This means the radio generates an ACK packet as soon as it receives a valid IEEE 802.15.4 packet (even before the packet is fetched from the radio). In most cases, this practice can be harmful.

Sometimes the radio receives a packet, but it gets lost before being processed by the MAC layer, e.g.:

  • Pkt buffer is full
  • The radio switches to TX mode when receiving (https://github.com/RIOT-OS/RIOT/pull/11256)

In both cases the radio sends an ACK packet to the sender (a.k.a. "all good, my MAC layer received the packet"), but the packet never reached the receiver's MAC layer.
This can produce weird behaviors, since the sender's MAC layer believes the packet was received and processed.

Note that hardware frame retransmissions are OK and can be used without any issues.

Thus, I propose to leave Auto ACK as optional and implement ACK response in software. Besides having a more reliable L2, we would automatically add ACK features to radios that don't provide Auto ACK caps.

network RFC stale

All 7 comments

In most cases, this practice can be harmful.

Realistically speaking, how often does this cause actual harm? What percentage of messages get ack'ed while actually discarded by the MCU?

The radio switches to TX mode when receiving (#11256)

This case is unique to radios with a shared receive/transmit buffer, right? So it only affects the at86rf2xx class of devices.

This can produce weird behaviors, …

Do you have some examples where this causes issues?

Thus, I propose to leave Auto ACK as optional and implement ACK response in software

Do you have some numbers on the impact of this? Flash size/RAM usage, but also power usage, as the MCU has to stay awake longer. Also, how much time is there between frame reception and ACK timeout, and is it realistic to handle ACKs in software if the radio is connected over SPI?

Besides having a more reliable L2

I'm a bit skeptical about this claim (if that wasn't clear yet from my comment). Handling L2 ACKs is a soft real-time case (missing deadlines will degrade service), and handling them in software puts a hard requirement on the real-time behaviour of RIOT.

I don't mind software L2 ACKs if they are a hard requirement for some specific MAC layers, but that's not mentioned in the description here.

Hi @bergzand

Realistically speaking, how often does this cause actual harm? What percentage of messages get ack'ed while actually discarded by the MCU?

For the upper layers it's not thaaaaat big of a deal (considering they are usually best effort).
However, some MAC and sub-MAC layers (OpenThread, TSCH) use the ACK for updating neighbor information.

This case is unique to radios with a shared receive/transmit buffer, right? So it only affects the at86rf2xx class of devices.

True. I'm just pointing out scenarios where this might happen.

Do you have some examples where this causes issues?

Giving wrong information to users of the MAC layer (e.g. Mesh Link Establishment) would probably affect the link quality. I'm aware, though, that we don't have such a thing in RIOT yet (and stacks that use it already implement the feature).

Do you have some numbers on the impact of this? Flash size/RAM usage, but also power usage, as the MCU has to stay awake longer. Also, how much time is there between frame reception and ACK timeout, and is it realistic to handle ACKs in software if the radio is connected over SPI?

Unfortunately not. Since it's not implemented, I don't have numbers to compare. I only have some test branches with auto ACK, but I didn't test deeper.

I'm a bit skeptical about this claim (if that wasn't clear yet from my comment). Handling L2 ACKs is a soft real-time case (missing deadlines will degrade service), and handling them in software puts a hard requirement on the real-time behavior of RIOT.

There's some information from TinyOS that software ACKs tend to have higher drop rates (https://vs-git.informatik.uni-kl.de/engel/tinyos/blob/020c6a6d8cc542bf58ca6afb8b1bf24efbe381de/doc/txt/tep126.txt), but at least there are no false positives.
AFAIK, the IEEE 802.15.4 standard doesn't specify hard constraints (only timeout values). If an ACK packet is not delivered in time, it's assumed lost. But as said before, I have no information about how well the OS performs with software ACKs. It would be interesting to have some benchmarks.

I don't mind software L2 ACKs if they are a hard requirement for some specific MAC layers, but that's not mentioned in the description here.

We need to implement software ACK anyway for radios that don't support Auto ACK and features that are not available in hardware accelerators (I'm not aware of radios that support Enhanced ACKs to be honest).
The question is, do we want this to be optional or mandatory for our default configurations? As described before, the idea is not to remove Auto ACK support but to have better support of L2. We can still enable Auto ACK if we want (that's why I proposed to add radio caps in https://github.com/RIOT-OS/RIOT/pull/11473)

Maybe I am mistaken but the solution seems simple:

  • We implement a (hardware independent) ARQ -> we want that regardless of this discussion.
  • We compare performances (and interoperability) of hardware vs software implementations.
  • We decide on a platform-specific default setting, which does not prevent compiling the other.

As a side note: The fact that we never faced the problem highlighted by @jia200x can be related to missing MAC layers on top of a radio in RIOT. Furthermore, with low-power lossy radios we tolerate a certain amount of loss anyway.

How important are software ACKs for 6TiSCH? IIRC, OpenWSN uses software ACKs partly to track link quality among neighbors.

How important are software ACKs for 6TiSCH? IIRC, OpenWSN uses software ACKs partly to track link quality among neighbors.

6TiSCH doesn't speak against hardware ACKs, but requires Enhanced ACKs to pass timing information. In OpenWSN, this is handled by the pkg itself.

*and Enhanced ACKs are not supported by the hardware that I know of.

Furthermore, the problem with hardware acknowledgment capabilities in some cases is that the hardware takes full responsibility for the mechanism without necessarily exposing relevant information, such as whether an ACK was received or the number of retries. With 6TiSCH you might retransmit a frame in another cell, which requires this kind of information. Thus, OpenWSN builds on a limited radio device feature set and implements all other MAC components in software.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.
