Linux: wlan freezes in raspberry pi 3/PiZeroW (Not 3B+)

Created on 11 Mar 2016  ·  477 Comments  ·  Source: raspberrypi/linux

I moved an SD card (running Debian 8 jessie, kernel 4.1.19) from a Raspberry Pi 2 with USB wifi (EDIMAX EW-7811UN Wireless USB Adapter, 150 Mbit/s, IEEE802.11b/g/n) into the new Raspberry Pi 3, which uses the integrated wlan. Since then the wlan freezes after a while (several hours) of usage. I couldn't find out whether it's due to heavy wifi usage or not. Because I haven't changed the software, I guess it has to do with the new hardware. When the wlan freezes, the Pi can no longer be reached; neither ifdown + ifup nor restarting the networking service helps, and I have to reboot the system to get it working again. syslog doesn't say much, only this:
dhcpcd[522]: wlan0: fe80::8af7:c7ff:fece:5912: expired option 25,

I've tried to change these settings so far, but without improvement:

sudo nano /etc/network/interfaces
wireless-power off

sudo nano /etc/sysctl.conf
At the end of the file, add the following line:
vm.min_free_kbytes = 16384

sudo nano /boot/cmdline.txt
At the end of the line (everything in cmdline.txt must stay on one single line), add:
smsc95xx.turbo_mode=N
dwc_otg.dma_enable=1 dwc_otg.dma_burst_size=256

Bug Waiting for external input Wifi Issue confirmed

Most helpful comment

As an update to the issue, it seems the cause of the crash, at least in my case, is due to my Pi Zero being connected to a network that has 802.11r fast roaming enabled. If I reconnect to a non 802.11r network, I do not have connectivity issues. I have tested with roamoff=1 as well as roamoff=0, and I can always re-create the driver issue during an inbound SCP to the device. Since roamoff has no impact on the issue, this leads me to think the issue is within the brcmfmac driver on handling 802.11r networks.

All 477 comments

EDIMAX EW-7811UN.... That's using rtl8188cus chipset, IIRC.

If you haven't already got one, create /etc/modprobe.d/8192cu.conf, with content....

# Disable power management

options 8192cu rtw_power_mgnt=0 rtw_enusbss=0

The rpi3 actually uses the brcmfmac driver for the inbuilt wifi.
There is an issue that requires the power saving / management to be turned off.

I think the newer Raspbian kernels have already been patched to disable power saving by default, but I don't think it's in this 4.5 branch yet.

What I'm doing at the moment (gentoo install) is the following at bootup to disable the power saving on the wifi card

iw wlan0 set power_save off

The rpi3 actually uses the brcmfmac driver for the inbuilt wifi

Yes, I know. Oh I see. He's not using the EDIMAX EW-7811UN dongle anymore. He used to use it with RPi2.

Yes, I don't use the USB wifi any longer. How do I set up the command line to turn off the power management?
crontab
@reboot iw wlan0 set power_save off

Not sure for Raspbian; since I'm using Gentoo it'll be different.
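A minimal sketch of how the workaround above could be checked after boot. It assumes that `iw dev wlan0 get power_save` prints a line such as `Power save: on`; the function name `power_save_state` is made up for illustration, and wlan0 is the interface name used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: extract the power-save state ("on" or "off") from the
# text that `iw dev <iface> get power_save` prints. Reads that text on stdin.
# Assumed input format: "Power save: on"
power_save_state() {
    sed -n 's/^[[:space:]]*Power save:[[:space:]]*\([a-z]*\).*$/\1/p' | head -n 1
}

# Example use on a real system (not run here):
#   state=$(iw dev wlan0 get power_save | power_save_state)
#   [ "$state" = "on" ] && sudo iw wlan0 set power_save off
```

Parsing the reported state rather than setting it blindly makes it possible to log whether the `@reboot` entry actually took effect.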

It seems to work: since I turned the power management off I haven't had another wlan crash.

Just to mention it: to restart the wlan automatically after a crash, this helps:
sudo cp /etc/wpa_supplicant/ifupdown.sh /etc/ifplugd/action.d/ifupdown

BTW, latest apt-get upgrade kernel has power management disabled by default.
@dh-connect does this work for you if you remove your current workaround?

It's still crashing after the latest upgrade; now I get this error in syslog:
brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!

When you say it's crashing, are there symptoms other than the error message?

No, just the one I have posted here, but it is in the log many times.

The wlan stops working. I can still work with it, but to get the wlan working again I have to reboot.

Thanks - I think "wlan stops working" counts as a symptom.

I've tried a few things, but wlan still breaks down

To answer the question above: when I remove the setting
wireless-power off from /etc/network/interfaces,
reboot,
and check the settings with iwconfig,
power management is turned back on. So I wouldn't say that it is disabled by default, and I will leave the setting in place.

I tried that with kernel 4.1.19 and now also with kernel 4.1.20 ... no change

When the wlan has crashed and I try to turn it back on with ifdown and ifup wlan0, I get this:
Error for wireless request "Set Power Management" (8B2C) : SET failed on device wlan0 ; Invalid exchange.

I also got a few more errors in syslog:

dhcpcd[532]: wlan0: xxx: expired option 25

Mar 21 17:29:35 raspberrypi kernel: [ 6627.337503] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -52
Mar 21 17:29:36 raspberrypi wpa_supplicant[6318]: Successfully initialized wpa_supplicant
Mar 21 17:29:36 raspberrypi dhcpcd[532]: wlan0: carrier lost

Mar 21 17:29:43 raspberrypi kernel: [ 6635.337616] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -52

Mar 21 17:29:45 raspberrypi kernel: [ 6637.337588] brcmfmac: brcmf_do_escan: error (-52)
Mar 21 17:29:45 raspberrypi kernel: [ 6637.337602] brcmfmac: brcmf_cfg80211_scan: scan error (-52)

Mar 21 17:29:47 raspberrypi kernel: [ 6639.337596] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -52
Mar 21 17:29:49 raspberrypi kernel: [ 6641.337632] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52

is there anything else I could try?

also these:

Mar 21 21:26:55 raspberrypi dhcpcd[526]: wlan0: xxx: expired option 25
Mar 21 21:28:54 raspberrypi kernel: [ 1958.899715] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Mar 21 21:30:16 raspberrypi dhcpcd[526]: wlan0: xxx is unreachable, expiring it
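The driver messages above carry negative errno values. On Linux, 52 is EBADE ("Invalid exchange"), which matches the text iwconfig printed for the failed "Set Power Management" request. As a rough sketch for comparing logs between reports, the following tallies brcmfmac lines by error code; the function name `brcmf_error_tally` is hypothetical and the patterns are only guessed from the log lines quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count brcmfmac kernel log lines per negative error code.
# Reads log text on stdin and prints "<count> <code>" pairs, most frequent first.
# The code is taken as the "-<digits>" that follows ", ", "(" or "status ".
brcmf_error_tally() {
    grep 'brcmfmac:' |
        sed -nE 's/.*(, |\(|status )(-[0-9]+).*/\2/p' |
        sort | uniq -c | sort -rn |
        awk '{print $1, $2}'
}

# Example use (not run here):
#   dmesg | brcmf_error_tally
#   grep kernel /var/log/syslog | brcmf_error_tally
```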

I'm not surprised that iwconfig thinks the device has power-saving enabled - I blocked it within the driver itself, and either the state is saved in the higher layers or there is another change required in order to report it correctly. Either way, the evidence is strong that we have avoided the power-saving bugs, but some other problems still remain.

Do you have any rough figures for the time-to-failure and roughly how much data might have been transferred (from ifconfig)?

Yes, I do. When I have just the webserver running with not much traffic (less than 100 MB) it lasts a day or two; when I transfer large data files like 1 GB, wlan crashes within 1 hour.
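To put numbers on time-to-failure and transferred data, the interface counters can be sampled periodically. This is a sketch only: it reads the standard sysfs files `/sys/class/net/<iface>/statistics/rx_bytes` and `tx_bytes`, while the function name `iface_bytes`, its optional second argument and the log path in the comment are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "rx_bytes tx_bytes" for a network interface.
# $1 = interface name, $2 = sysfs base directory (defaults to /sys/class/net).
iface_bytes() {
    iface=$1
    base=${2:-/sys/class/net}
    stats="$base/$iface/statistics"
    # Fail if the interface (or its counters) does not exist
    [ -r "$stats/rx_bytes" ] && [ -r "$stats/tx_bytes" ] || return 1
    echo "$(cat "$stats/rx_bytes") $(cat "$stats/tx_bytes")"
}

# Example use (not run here): append a timestamped sample to a log file,
# e.g. from a cron job that runs every few minutes:
#   echo "$(date '+%F %T') $(iface_bytes wlan0)" >> /var/tmp/wlan0-bytes.log
```

The last sample written before a freeze then gives an approximate byte count and time for the failure.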

Is there anything I can provide to help find the bug?

Here are some errors from syslog:

Mar 29 14:20:56 raspberrypi dhcpcd[535]: wlan0: xxx: expired option 25
Mar 29 14:30:15 raspberrypi dhcpcd[535]: wlan0: xxx is unreachable, expiring it
Mar 29 17:18:42 raspberrypi kernel: [186148.102420] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
Mar 29 17:18:43 raspberrypi kernel: [186149.101045] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
Mar 29 17:18:43 raspberrypi kernel: [186149.101145] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
Mar 29 17:18:44 raspberrypi kernel: [186150.101209] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
Mar 29 17:18:50 raspberrypi wpa_supplicant[478]: wlan0: CTRL-EVENT-DISCONNECTED bssid=xxx reason=3 locally_generated=1
Mar 29 17:18:50 raspberrypi kernel: [186156.181033] brcmfmac: brcmf_cfg80211_disconnect: error (-52)
Mar 29 17:18:52 raspberrypi kernel: [186158.181028] brcmfmac: send_key_to_dongle: wsec_key error (-52)
Mar 29 17:18:54 raspberrypi kernel: [186160.181046] brcmfmac: send_key_to_dongle: wsec_key error (-52)
Mar 29 17:18:56 raspberrypi kernel: [186162.181048] brcmfmac: send_key_to_dongle: wsec_key error (-52)
Mar 29 17:18:58 raspberrypi kernel: [186164.181049] brcmfmac: send_key_to_dongle: wsec_key error (-52)
Mar 29 17:18:58 raspberrypi kernel: [186164.185477] cfg80211: Calling CRDA to update world regulatory domain
Mar 29 17:18:58 raspberrypi dhcpcd[535]: wlan0: carrier lost
Mar 29 17:18:58 raspberrypi wpa_supplicant[7354]: Successfully initialized wpa_supplicant
Mar 29 17:18:58 raspberrypi kernel: [186164.314511] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
Mar 29 17:18:58 raspberrypi kernel: [186164.314541] cfg80211: World regulatory domain updated:
Mar 29 17:18:58 raspberrypi kernel: [186164.314548] cfg80211: DFS Master region: unset
Mar 29 17:18:58 raspberrypi kernel: [186164.314555] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
Mar 29 17:18:58 raspberrypi kernel: [186164.314565] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Mar 29 17:18:58 raspberrypi kernel: [186164.314573] cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Mar 29 17:18:58 raspberrypi kernel: [186164.314581] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
Mar 29 17:18:58 raspberrypi kernel: [186164.314592] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
Mar 29 17:18:58 raspberrypi kernel: [186164.314602] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
Mar 29 17:18:58 raspberrypi kernel: [186164.314611] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
Mar 29 17:18:58 raspberrypi kernel: [186164.314645] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
Mar 29 17:18:58 raspberrypi kernel: [186164.314654] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)

Thanks for the offer, but this is in the hands of Broadcom now.

Any update from Broadcom if this is a bug which will be fixed? I now have a cron job setup to bring down and up wlan0 when it fails to ping.
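A cron-driven watchdog like the one described could look roughly like this. It is a sketch under assumptions: the ping target (for example the router address), the function names and the `PING_CMD` / `RESTART_CMD` override variables are hypothetical, and the restart step is simply the ifdown/ifup pair mentioned in this thread. Other reports here say that only a reboot helps once the driver returns -52, so this may not recover every freeze.

```shell
#!/bin/sh
# Default restart action: the ifdown/ifup pair used in this thread.
wlan_restart() {
    ifdown "$1"
    sleep 5
    ifup "$1"
}

# Hypothetical watchdog: restart an interface when a ping target is unreachable.
# $1 = interface, $2 = address to ping (an assumption, e.g. the router).
# PING_CMD and RESTART_CMD may be overridden, which also allows a dry run.
wlan_watchdog() {
    iface=$1
    target=$2
    ping_cmd=${PING_CMD:-ping -c 3 -W 5 -I $iface}
    restart_cmd=${RESTART_CMD:-wlan_restart}
    if $ping_cmd "$target" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "restarting $iface"
        $restart_cmd "$iface"
    fi
}

# Example use as root (not run here), e.g. called from cron every few minutes:
#   wlan_watchdog wlan0 192.168.1.1
```

Keeping the ping and restart commands replaceable makes it easy to swap the restart for a reboot if bringing the interface down and up turns out not to be enough.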

Quick update from my side: I could get the problem fixed, and it seems to be driver related. I installed Ubuntu MATE 16.04 with kernel 4.4.8 and haven't had any problems with wifi since.

I mean, they advertise it as: "Ubuntu MATE 16.04 also has fully working Bluetooth and Wifi on the Raspberry Pi 3", which seems true.

Maybe it also works with a new Debian release, which I cannot tell.

@juched78 Are you running a 4.4 kernel? If not, please run sudo rpi-update to get the latest 4.4.8 build and see if that suffers the same problem.

The Broadcom drivers have changed significantly since 4.1, and our 4.4 tree includes back-ports of some fixes that went into 4.5. I'm not aware of any outstanding bugs apart from the failure to wake from sleep (power management is still disabled) - channels 12 & 13 are usable where permitted, and Ad Hoc mode doesn't crash - but there may still be lurking issues.

Oh, there is one reported bug still in 4.4.8 - apparently heavy use of hostapd can lead to a kernel warning (see https://github.com/raspberrypi/linux/issues/1375).

I am running:
Linux XXX 4.4.8-v7+ #880 SMP Fri Apr 22 21:55:04 BST 2016 armv7l GNU/Linux

Apr 27 2016 11:06:18
Copyright (c) 2012 Broadcom
version 9b52ab7b475f4a056658fd2d95d2440b32167390 (clean) (release)

With my Netgear R7000 running Shibby Tomato, around 2 days in the wifi drops, and in the sys logs I see:

CTRL-EVENT-DISCONNECTED
brcmfmac: brcmf_link_down: WLC_DISASSOC failed (-52)
brcmfmac: send_key_to_dongle: wsec_key error (-52)
...
brcmfmac: brcmf_do_escan: error (-52)
...
wpa_supplicant[506]: wlan0: CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
...
brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code

(then I see it scan and re-pick my country code CA)

brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -52
brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52
brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -52

Then it seems to never reconnect...

Using sudo ifdown wlan0 followed by sudo ifup wlan0 brings back my connection.

Just upgraded to:
Linux JuchePi 4.4.8-v7+ #881 SMP Sat Apr 30 12:16:50 BST 2016 armv7l GNU/Linux

Not sure what changed between the 22nd and the 30th. I will monitor the connection.

My RPi 3 also hit that problem. I got a few different kernel messages, mainly the ones below.
After that I can't get the WiFi working; bringing wlan0 down then up does not help.

May 09 21:24:25 osmc kernel: brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
May 09 22:00:15 osmc kernel: brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
May 09 22:00:18 osmc kernel: brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
May 10 00:51:10 osmc kernel: brcmfmac: brcmf_cfg80211_get_tx_power: error (-52)
May 10 00:51:12 osmc kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
May 10 00:53:16 osmc kernel: brcmfmac: brcmf_do_escan: error (-52)
May 10 00:53:16 osmc kernel: brcmfmac: brcmf_cfg80211_scan: scan error (-52)

Raspberry is powered from original power adapter for version 3. I'm running latest OSMC:
$ uname -a
Linux osmc 4.4.8-3-osmc #1 SMP PREEMPT Sun May 1 18:57:43 UTC 2016 armv7l GNU/Linux

Still monitoring. I had openhab go offline after running 3 days, but for some reason I could still ssh into the Pi, which I usually couldn't. At the top of the hour the wifi script ran to bring the connection down and up, and then it reconnected to my openhab org. Odd. Will keep watching.

I am also experiencing the same issue - dmesg trace as follows:

send_key_to_dongle: wsec_key error (-52)
brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -52
brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
brcmf_cfg80211_get_tx_power: error (-52)

Usage:

rp3 is being used as a router/access point

Connectivity length seems random - I've had as high as two weeks, and as bad as a few minutes. Lately it's been going out every 20 minutes or so. Bringing wlan0 down and back up does not resolve the issue - a full reboot is required.

Problem seems to be exacerbated while streaming Netflix from my AppleTV. Though this was not the case when I had the two weeks of uptime.

I'm on 4.4.10-v7+

I switched the channel from 13 to 6 to check if that could be the problem (there were some defects reported about the high channels), and since then I haven't had a WiFi freeze. But that could be a coincidence...

Changing access point channels didn't help. WiFi still breaks. The last few times I had to restart a few times in a row to get it working.

I experience this issue specifically when I try to do an SFTP transfer between the rpi3 and my Galaxy S5 phone. When I try to perform the same transfer from my laptop, everything runs smoothly.

Running latest kernel from rpi-update:

Linux raspberrypi 4.4.11-v7+ #888 SMP Mon May 23 20:10:33 BST 2016 armv7l GNU/Linux

Error message from syslog:

May 29 18:10:46 raspberrypi kernel: [  178.605907] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

It seems that the only solution after this error is a reboot.

I have had mine drop off the network twice in the past week. First time I was in a rush so just unplugged and rebooted. Few days later it happened again, rebooted again and then ran full system updates (including firmware) and will monitor. Have it mounted with no monitor near by, so getting details on the error needs more effort :)

Same problem here. It always freezes when transferring big files by sftp. Only rebooting solves it.

Broadcom say that #1313 is a non-issue, and the latest kernel no longer shows those messages.

I've been unable to reproduce this problem. Has anybody been able to capture a packet trace around the time of failure?

If anyone has time to do some more testing with a debug-enabled driver module it would be much appreciated:

1) Run sudo rpi-update and reboot. This is to get your kernel up to the same level as mine so that the module is compatible.

2) Download and install the updated driver module:

# Locate the brcm80211 module directories for the running kernel
BRCM80211=/lib/modules/$(uname -r)/kernel/drivers/net/wireless/brcm80211
BRCMFMAC=$BRCM80211/brcmfmac
BRCMUTIL=$BRCM80211/brcmutil
# Download the debug builds of both modules
wget -O brcmfmac.ko "https://docs.google.com/uc?authuser=0&id=0B8VsfKAD4-NOR1ZxWS00ZmFrR1k&export=download"
wget -O brcmutil.ko "https://docs.google.com/uc?authuser=0&id=0B8VsfKAD4-NOM0ZDd3FvYUNwZXc&export=download"
# Keep the originals as *.orig, then install the debug builds
sudo mv $BRCMFMAC/brcmfmac.ko{,.orig}
sudo cp brcmfmac.ko $BRCMFMAC
sudo mv $BRCMUTIL/brcmutil.ko{,.orig}
sudo cp brcmutil.ko $BRCMUTIL
# Enable the driver's debug output
sudo sh -c "echo options brcmfmac debug=0x100000 > /etc/modprobe.d/brcmfmac.conf"

Reboot to activate the new modules.

3) Use your Pi as normal, then if your WiFi freezes run:

dmesg > wifi_freeze.txt

and upload it to your favourite pasting site (or create a Gist). One or two logs should be plenty.

To restore the original version of the module:

BRCM80211=/lib/modules/`uname -r`/kernel/drivers/net/wireless/brcm80211
sudo mv $BRCM80211/brcmfmac/brcmfmac.ko{.orig,}
sudo mv $BRCM80211/brcmutil/brcmutil.ko{.orig,}

Thanks in advance.

Hang on for a moment while we verify that the debug output really is enabled.

You will also need to enable a debug feature on the driver:

sudo sh -c "echo options brcmfmac debug=0x100000 > /etc/modprobe.d/brcmfmac.conf"

I've amended the instructions above.

After a reboot your dmesg output should include something like this:

[   10.848903] brcmfmac: CONSOLE: hndarm_armr addr: 0x18003000, cr4_idx: 0
[   10.860475] brcmfmac: CONSOLE: 000000.001
[   10.869471] brcmfmac: CONSOLE: RTE (SDIO-CDC) 7.45.41.26 (r640327) on BCM43430 r1 @ 37.4/81.6/81.6MHz
[   10.883644] brcmfmac: CONSOLE: 000000.001 sdpcmdcdc0: Broadcom SDPCMD CDC driver
[   10.896090] brcmfmac: CONSOLE: 000000.005 reclaim section 0: Returned 47716 bytes to the heap
[   10.909734] brcmfmac: CONSOLE: 000000.007 wlc_bmac_info_init: host_enab 1
[   10.921417] brcmfmac: CONSOLE: 000000.026 wl0: Broadcom BCM43430 802.11 Wireless Controller 7.45.41.26 (r640327)
[   10.936777] brcmfmac: CONSOLE: 000000.027 TCAM: 256 used: 179 exceed:0
[   10.936794] brcmfmac: CONSOLE: 000000.028 reclaim section 1: Returned 81268 bytes to the heap
[   10.936803] brcmfmac: CONSOLE: 000000.029 sdpcmd_dpc: Enable
[   10.938242] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: May 27 2016 00:13:38 version 7.45.41.26 (r640327) FWID 01-df77e4a7
[   10.949404] brcmfmac: CONSOLE: 000000.125 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   10.963663] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
[   10.969865] brcmfmac: CONSOLE: 000000.150 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   10.969876] brcmfmac: CONSOLE: 000000.151 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   11.189639] brcmfmac: CONSOLE: 000000.368 wl0: wl_open
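Before waiting for a freeze, it can help to confirm that the debug module is really logging. The sketch below assumes that, as in the sample output above, an active debug setting shows up as `brcmfmac: CONSOLE:` lines and that the firmware version appears in the `brcmf_c_preinit_dcmds` line; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed if the log on stdin contains firmware console
# lines ("brcmfmac: CONSOLE:"), as seen in the sample dmesg output.
brcmf_console_enabled() {
    grep -q 'brcmfmac: CONSOLE:'
}

# Hypothetical helper: print the firmware version reported by
# brcmf_c_preinit_dcmds (the dotted number after the word "version").
brcmf_fw_version() {
    sed -n 's/.*brcmf_c_preinit_dcmds: Firmware version = .* version \([0-9.]*\).*/\1/p' | head -n 1
}

# Example use (not run here):
#   dmesg | brcmf_console_enabled && echo "debug console output present"
#   dmesg | brcmf_fw_version
```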

@pelwell after executing your instructions I no longer have wifi...

root@pi3b:/home/pi# dmesg | grep brcmf
[ 15.582665] brcmfmac: Unknown symbol brcmu_dbg_hex_dump (err 0)
[ 15.613709] brcmfmac: Unknown symbol brcmu_dbg_hex_dump (err 0)

Try this:

BRCMUTIL=/lib/modules/`uname -r`/kernel/drivers/net/wireless/brcm80211/brcmutil
wget -O brcmutil.ko "https://docs.google.com/uc?authuser=0&id=0B8VsfKAD4-NOM0ZDd3FvYUNwZXc&export=download"
sudo mv $BRCMUTIL/brcmutil.ko{,.orig}
sudo cp brcmutil.ko $BRCMUTIL

And reboot.

wlan0 does not associate.
wireless.txt
(In one of many reboots I saw an association for a few minutes, though; I have not caught it (yet) in dmesg.)

Seems like the issue may have been resolved for me by upgrading from 4.4.11-v7+ to 4.4.15-v7+

I tried to recreate the problems I was having with SFTP transfers from an Android phone, but I'm not seeing any problems as of right now.

@pelwell after a long wait wlan0 succeeded to associate; appended dmesg to previous log:
wireless.txt
Waiting now for a freeze or association loss.
Hope this is helpful.

@pelwell quickly lost connection again; appended dmesg to:
wireless.txt

Thank you. It was slow for me the first time. I've been busy getting a clean Raspbian and applying the patches to try to reproduce the problem - I'll continue anyway.

@pelwell
wireless.txt
and reassociated again: appended dmesg again
do you want me to continue?

@pelwell :lost association again
wireless_associationloss.txt

@pelwell
it is switching on/off irregularly
wireless_associationloss.txt

I think you'd better switch back now before my inbox overflows.

ok; I will revert to my €3 MT7601U dongle. ;)

Thanks for your help so far,

I've just found this issue so can I confirm that it is similar to what I am seeing? I have set up a RPi 3 as an access point and every so often I am unable to connect to it. I am able to ssh in over the wired connection and I see that wlan0 is still up with the correct IP address but the only way to get the access point working again is to reboot. I see stack traces like this in /var/log/messages

Jul 16 06:57:18 raspberrypi kernel: [117621.171957] ------------[ cut here ]------------
Jul 16 06:57:18 raspberrypi kernel: [117621.172042] WARNING: CPU: 2 PID: 879 at drivers/net/wireless/brcm80211/brcmfmac/core.c:1191 brcmf_netdev_wait_pend8021x+0xe4/0xf0 [brcmfmac]()
Jul 16 06:57:18 raspberrypi kernel: [117621.172052] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables bnep hci_uart btbcm bluetooth brcmfmac brcmutil cfg80211 rfkill snd_bcm2835 snd_pcm snd_timer snd bcm2835_gpiomem bcm2835_wdt uio_pdrv_genirq uio ipv6
Jul 16 06:57:18 raspberrypi kernel: [117621.172168] CPU: 2 PID: 879 Comm: hostapd Tainted: G        W       4.4.11-v7+ #888
Jul 16 06:57:18 raspberrypi kernel: [117621.172177] Hardware name: BCM2709
Jul 16 06:57:18 raspberrypi kernel: [117621.172212] [<80018724>] (unwind_backtrace) from [<80014058>] (show_stack+0x20/0x24)
Jul 16 06:57:18 raspberrypi kernel: [117621.172235] [<80014058>] (show_stack) from [<803205a4>] (dump_stack+0xd4/0x118)
Jul 16 06:57:18 raspberrypi kernel: [117621.172259] [<803205a4>] (dump_stack) from [<80025300>] (warn_slowpath_common+0x98/0xc8)
Jul 16 06:57:18 raspberrypi kernel: [117621.172282] [<80025300>] (warn_slowpath_common) from [<800253ec>] (warn_slowpath_null+0x2c/0x34)
Jul 16 06:57:18 raspberrypi kernel: [117621.172350] [<800253ec>] (warn_slowpath_null) from [<7f23a1d4>] (brcmf_netdev_wait_pend8021x+0xe4/0xf0 [brcmfmac])
Jul 16 06:57:18 raspberrypi kernel: [117621.172466] [<7f23a1d4>] (brcmf_netdev_wait_pend8021x [brcmfmac]) from [<7f228fbc>] (send_key_to_dongle+0xa4/0xf8 [brcmfmac])
Jul 16 06:57:18 raspberrypi kernel: [117621.172579] [<7f228fbc>] (send_key_to_dongle [brcmfmac]) from [<7f229208>] (brcmf_cfg80211_del_key+0x68/0x78 [brcmfmac])
Jul 16 06:57:18 raspberrypi kernel: [117621.172723] [<7f229208>] (brcmf_cfg80211_del_key [brcmfmac]) from [<7f1742f0>] (nl80211_del_key+0xfc/0x28c [cfg80211])
Jul 16 06:57:18 raspberrypi kernel: [117621.172817] [<7f1742f0>] (nl80211_del_key [cfg80211]) from [<80505e00>] (genl_rcv_msg+0x26c/0x3f0)
Jul 16 06:57:18 raspberrypi kernel: [117621.172841] [<80505e00>] (genl_rcv_msg) from [<80504fd8>] (netlink_rcv_skb+0xb0/0xcc)
Jul 16 06:57:18 raspberrypi kernel: [117621.172862] [<80504fd8>] (netlink_rcv_skb) from [<80505b84>] (genl_rcv+0x34/0x44)
Jul 16 06:57:18 raspberrypi kernel: [117621.172883] [<80505b84>] (genl_rcv) from [<80504914>] (netlink_unicast+0x190/0x254)
Jul 16 06:57:18 raspberrypi kernel: [117621.172904] [<80504914>] (netlink_unicast) from [<80504de0>] (netlink_sendmsg+0x340/0x354)
Jul 16 06:57:18 raspberrypi kernel: [117621.172926] [<80504de0>] (netlink_sendmsg) from [<804b7c14>] (sock_sendmsg+0x24/0x34)
Jul 16 06:57:18 raspberrypi kernel: [117621.172947] [<804b7c14>] (sock_sendmsg) from [<804b82fc>] (___sys_sendmsg+0x1e0/0x1e8)
Jul 16 06:57:18 raspberrypi kernel: [117621.172968] [<804b82fc>] (___sys_sendmsg) from [<804b9054>] (__sys_sendmsg+0x4c/0x7c)
Jul 16 06:57:18 raspberrypi kernel: [117621.172988] [<804b9054>] (__sys_sendmsg) from [<804b909c>] (SyS_sendmsg+0x18/0x1c)
Jul 16 06:57:18 raspberrypi kernel: [117621.173008] [<804b909c>] (SyS_sendmsg) from [<8000fb40>] (ret_fast_syscall+0x0/0x1c)
Jul 16 06:57:18 raspberrypi kernel: [117621.173019] ---[ end trace 2d66bc66d6534ca4 ]---

My kernel is 4.4.13-v7+ and I have just run rpi-update for the first time so I don't know yet if that will help.

I wonder if this might be related, or perhaps a separate issue
https://www.youtube.com/watch?v=_D_fi_ck9Vo

My RPI3 worked without any problems via WiFi until I upgraded it to latest udev ...

Now, it doesn't connect anymore ...

I've also installed the patched modules from pelwell, but with no success: it simply doesn't connect ...

Let me know if I can help,

My best,
Mimmo

@dh-connect has your issue been resolved? If so, please close this issue. Thanks.

I'm working with lan since, haven't tried wlan

Hi,

I've got what seems to be the same issue with my rpi 3. I've reverted to using the official RPI wifi usb dongle, which is rock solid, but the built-in wifi dies after ~20 hours of connectivity with this kind of message in syslog:

brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
cfg80211: World regulatory domain updated:
cfg80211: DFS Master region: unset

this is on latest raspbian, latest firmware

Is it possible to re-open this issue?
Why was it closed?

I'm working with lan since, haven't tried wlan
dh-connect closed this 13 days ago

This is not a solution worth closing the issue...

I still have the issue and can reproduce the bug.

My relevant portion of dmesg is:

[174174.396705] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[174215.037175] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -52
[174217.037166] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -52
[174219.037171] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52

I'm running into the same problem as @jrmhaig and, after upgrading, now have:

$ dpkg-query -s firmware-brcm80211
Package: firmware-brcm80211
Status: install ok installed
Priority: optional
Section: non-free/kernel
Installed-Size: 4296
Maintainer: Debian Kernel Team <debian-kernel@lists.debian.org>
Architecture: all
Multi-Arch: foreign
Source: firmware-nonfree
Version: 0.43+rpi5
Suggests: initramfs-tools
Description: Binary firmware for Broadcom 802.11 wireless cards
 This package contains the binary firmware for wireless network cards with
 the Broadcom BCM4313, BCM43224, BCM43225, BCM43241, BCM43143, BCM4329,
 BCM4330, BCM4334, BCM4335 or BCM43430 chips, supported by the brcmsmac or
 brcmfmac driver.
 .
 Contents:
  * Broadcom 802.11 firmware, version 610.812 (brcm/bcm43xx-0.fw)
  * Broadcom 802.11 firmware header, version 610.812
    (brcm/bcm43xx_hdr-0.fw)
  * Broadcom BCM43143 firmware (brcm/brcmfmac43143-sdio.bin)
  * Broadcom BCM43241 rev 0-3 firmware (brcm/brcmfmac43241b0-sdio.bin)
  * Broadcom BCM43241 rev 4+ firmware (brcm/brcmfmac43241b4-sdio.bin)
  * Broadcom BCM4329 firmware (brcm/brcmfmac4329-sdio.bin)
  * Broadcom BCM4330 firmware (brcm/brcmfmac4330-sdio.bin)
  * Broadcom BCM4334 firmware (brcm/brcmfmac4334-sdio.bin)
  * Broadcom BCM4335 firmware (brcm/brcmfmac4335-sdio.bin)
  * Broadcom BCM43362 firmware (brcm/brcmfmac43362-sdio.bin)
  * Broadcom BCM4354 firmware (brcm/brcmfmac4354-sdio.bin)
  * Broadcom BCM43143 firmware (brcm/brcmfmac43143.bin)
  * Broadcom BCM43430 firmware (brcm/brcmfmac43430-sdio.bin)
  * NVRAM file for BCM943430 (brcm/brcmfmac43430-sdio.txt)
Homepage: http://git.kernel.org/?p=linux/kernel/git/firmware/linux-firmware.git

Setup hostapd with a bridge.

/etc/hostapd/hostapd.conf

ctrl_interface=/var/run/hostapd
###############################
# Basic Config
###############################
macaddr_acl=0
auth_algs=1
# Most modern wireless drivers in the kernel need driver=nl80211
driver=nl80211

#####
# Logging
#####
logger_syslog_level=0

##########################
# Local configuration...
##########################
interface=wlan0
bridge=br0
hw_mode=g
ieee80211n=1
channel=1
ssid=WillCrashOnYou
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=3
wpa_passphrase=JustYouWait:)
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP

/etc/network/interfaces

# interfaces(5) file used by ifup(8) and ifdown(8)

# Please note that this file is written to be used with dhcpcd
# For static IP, consult /etc/dhcpcd.conf and 'man dhcpcd.conf'

# Include files from /etc/network/interfaces.d:
source-directory /etc/network/interfaces.d

auto lo
iface lo inet loopback

#auto eth0
iface eth0 inet manual
#iface eth0 inet dhcp

#allow-hotplug wlan0
iface wlan0 inet manual
#    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf
#
#allow-hotplug wlan1
#iface wlan1 inet manual
#    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

auto br0
iface br0 inet dhcp
        post-up /etc/init.d/hostapd restart
        post-down /etc/init.d/hostapd stop
        bridge-ports eth0 wlan0

For people with WiFi problems, Cypress (was Broadcom) have provided us with debug modules to help diagnose the problems. Because modules are kernel-version-specific you will first need to update (or possibly revert) to a specific firmware release:

sudo rpi-update b0ef6e25679d3612a980708cf4c3907ce6e13e84
sudo shutdown -r now

Now you can download and install the debug modules:

wget -O brcmdbg.tgz "https://drive.google.com/uc?export=download&id=0B_P-i4u-SLBXb1o0UjVLY1NRbk0"
tar zxvf brcmdbg.tgz
sudo ./brcmdbg

The final command will run the installation script, which copies the original modules to one side before replacing them with the debug versions. Running the command again will revert to the original versions.

After installation, reboot your Pi 3 - now dmesg | grep brcmfmac will show diagnostic messages like this:

[    9.952095] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[    9.978064] usbcore: registered new interface driver brcmfmac
[   10.277931] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: May 27 2016 00:13:38 version 7.45.41.26 (r640327) FWID 01-df77e4a7
[   10.299380] brcmfmac: CONSOLE: hndarm_armr addr: 0x18003000, cr4_idx: 0
[   10.314284] brcmfmac: CONSOLE: 000000.001
[   10.326859] brcmfmac: CONSOLE: RTE (SDIO-CDC) 7.45.41.26 (r640327) on BCM43430 r1 @ 37.4/81.6/81.6MHz
[   10.326867] brcmfmac: CONSOLE: 000000.001 sdpcmdcdc0: Broadcom SDPCMD CDC driver
[   10.326876] brcmfmac: CONSOLE: 000000.005 reclaim section 0: Returned 47716 bytes to the heap
[   10.326882] brcmfmac: CONSOLE: 000000.007 wlc_bmac_info_init: host_enab 1
[   10.326890] brcmfmac: CONSOLE: 000000.026 wl0: Broadcom BCM43430 802.11 Wireless Controller 7.45.41.26 (r640327)
[   10.326895] brcmfmac: CONSOLE: 000000.027 TCAM: 256 used: 179 exceed:0
[   10.326902] brcmfmac: CONSOLE: 000000.028 reclaim section 1: Returned 81268 bytes to the heap
[   10.326911] brcmfmac: CONSOLE: 000000.029 sdpcmd_dpc: Enable
[   10.371343] brcmfmac: CONSOLE: 000000.121 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   10.422886] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
[   10.432919] brcmfmac: CONSOLE: 000000.185 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   10.432929] brcmfmac: CONSOLE: 000000.186 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   10.500547] brcmfmac: CONSOLE: 000000.254 wl0: wl_open
[   10.531447] brcmfmac: brcmf_add_if: ERROR: netdev:wlan0 already exists
[   10.531457] brcmfmac: brcmf_add_if: ignore IF event
[   10.536516] brcmfmac: power management disabled
[   10.540645] brcmfmac: CONSOLE: 000000.284 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[   13.950422] brcmfmac: CONSOLE: 000003.703 wl_nd_ra_filter_clear_cache: Enter..

When you hit a problem, use dmesg > wifidbg.txt to capture the tracing to a file, along with any other kernel messages, then upload the file somewhere (gist, pastebin, dropbox etc.) and post a link to it along with a description of what you were doing when the error occurred.

Please refresh my memory: what command do I use to return to stable firmware
if I decide to stop debugging?

sudo apt-get update
sudo apt-get upgrade

should do the trick. And sudo ./brcmdbg to just revert to the non-debug drivers.

https://gist.github.com/BenoitSvB/368983f2c09eed2d85a24e6920dc3a50#file-201609081547_wifidbg-txt

Started debugging; needed about 5 or 6 tries to associate; do not know why all but the last attempt failed; will let it run until I see association loss and dump a new dmesg then. Inconsistent association behaviour was my problem before I stopped using onboard wifi, so this might be on the spot. Please let me know if any additional activities could be helpful.

https://gist.github.com/BenoitSvB/bf8acdbb7b664df90e885603bb4774ce#file-201609081628_wifidbg-txt
Doing nothing but waiting; do we see here several association losses/recoveries?

Thanks for that. Hmm - those logs aren't very informative, but let's see what Cypress come back with.

https://gist.github.com/BenoitSvB/98db53ff884e7b1a57bf1475d6106c56
Unexplained loss and recovery of association; long enough to see in systray icon.
Accesspoint is Linksys wrt160n with Firmware: DD-WRT v24-sp2 (08/07/10) std.
Guess I can stop debugging for now and revert to my €3 MT7601U dongle, but let me know if I can be of further help.

@pelwell I did not see any firmware restore after sudo apt-get update && sudo apt-get upgrade, and sudo rpi-update gives:
*** Your firmware is already up to date
I guess I need to run rpi-update with a specific git hash to revert to stable firmware. Do you know which hash?

The commit history in the RPI-Distro repo shows that you want commit 390f53ed0fd79df274bdcc81d99e09fa262f03ab from the firmware repo, so run:

sudo rpi-update 390f53ed0fd79df274bdcc81d99e09fa262f03ab

@pelwell:
root@pi3b:/home/pi# sudo rpi-update 390f53ed0fd79df274bdcc81d99e09fa262f03ab
** Raspberry Pi firmware updater by Hexxeh, enhanced by AndrewS and Dom
*
* Performing self-update
** Relaunching after update
*
* Raspberry Pi firmware updater by Hexxeh, enhanced by AndrewS and Dom
Invalid git hash specified

Ah, the Hexxeh rpi-firmware has different commit IDs - try 569e6611ac20c735647eb0e550484a73935a672d.

I wonder if https://github.com/raspberrypi/linux/issues/1552 / #1444 might be related to this issue as well.

I have recently deployed a setup of 40 RPi3s which does some Bluetooth stuff; we had to get USB wifi interfaces or else wlan would constantly freeze. We now use the internal Bluetooth device, and the internal wifi module is blacklisted in modprobe.d.

It might be useful to run hcitool name 11:11:11:11:11:11 and see if that generates any interesting log entries as well. I have just been following this issue and haven't had the time to set up my lab environment to test anything myself. We had some wifi freezes without BT enabled, but the combination of wifi+BT can more or less always kill wifi in a very short timespan. This was always reproducible on any number of our RPis.

@pelwell
OK; uname -a gives Linux pi3b.thuis 4.4.13-v7+ #894 SMP Mon Jun 13 13:13:27 BST 2016 armv7l GNU/Linux
Just for information: where would anyone find the git hash for the actual stable firmware version?

@thomasf
Although I have Bluetooth up, I have no use for it at the moment. hcitool name 11:11:11:11:11:11 does not return anything, which is, I suppose, to be expected as I am not connected to any device. Maybe I should buy a BT audio device to play with.

Define stable.

The hash I (finally) gave you is for the 20th June firmware release, which you will get if you run:

sudo apt-get install raspberrypi-kernel
sudo apt-get install raspberrypi-bootloader

I'm not aware of a single place that contains the hash of the most recent "stable" release, but by going through RPI-Distro as I did then cross-referencing with the Hexxeh repo you can get rpi-update hashes for any release you like. If you consider the 2016-05-23 release to be stable because it was part of the last full Raspbian release then you want hash 3b98f7433649e13cf08f54f509d11491c99c4c0b which translates to an rpi-update hash of 2b9c0bfacfc11ee8bb9b30dc9cdb36289698f8a8 .
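The cross-references mentioned so far can be kept in a small lookup so that reverting after a test does not depend on repeating the comparison by hand. This is a minimal sketch: the table only holds the two pairs quoted in this thread, and the name rpi_update_command is hypothetical.

```python
# Pairs copied from this thread: firmware-repo commit -> Hexxeh rpi-firmware commit.
FIRMWARE_TO_RPI_UPDATE = {
    # 20th June release
    "390f53ed0fd79df274bdcc81d99e09fa262f03ab": "569e6611ac20c735647eb0e550484a73935a672d",
    # 2016-05-23 release
    "3b98f7433649e13cf08f54f509d11491c99c4c0b": "2b9c0bfacfc11ee8bb9b30dc9cdb36289698f8a8",
}


def rpi_update_command(firmware_hash):
    """Build the rpi-update command line for a firmware-repo hash listed above."""
    if firmware_hash not in FIRMWARE_TO_RPI_UPDATE:
        raise KeyError("no cross-reference recorded for " + firmware_hash)
    return "sudo rpi-update " + FIRMWARE_TO_RPI_UPDATE[firmware_hash]
```

Any other release still needs the manual comparison between the two repositories before it can be added to the table.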

@BenoitSvB Just running that hcitool command from a fresh boot, without touching hci0 with any other software, causes the wifi to start behaving badly in our tests. I don't know if it matters whether there are any other Bluetooth devices around, but it is the smallest reproducible example I can think of for triggering the wifi freezing problems.

I've also tested external bt dongle + internal wifi but the internal wifi sometimes hangs even when the internal bcm bt driver isn't loaded. The "solution" (as in quick fix) for us was to use usb wifi adapters, that has been proved stable in our tests and production usage.

I still suspect #1313 as related.

@pelwell
Stable would in this case be the firmware as released by the Foundation with its last publicized image and updated by "sudo apt-get update && sudo apt-get upgrade" only, so without invocation of rpi-update (with or without a specific git hash, which is meant, as I understood, for upgrading to more recent firmware for specific purposes only).
Which brings me to the question: can I read the hash of my operational firmware before loading a new firmware for testing, to make a restore after testing easier as I would not trust myself conducting the cross-reference you mentioned...

Perhaps - cat /boot/.firmware_revision is written by rpi-update, but
without trying it I couldn't tell you if the Raspbian releases also write
it.

/boot/.firmware_revision is an rpi-update thing (
https://www.raspberrypi.org/forums/viewtopic.php?t=106073&p=732449#p731830 )

But I found with:

zcat /usr/share/doc/raspberrypi-bootloader/changelog.Debian.gz

that I want indeed:

  • firmware as of 390f53ed0fd79df274bdcc81d99e09fa262f03ab

I understand the crossref from
https://github.com/RPi-Distro/firmware/commits/debian?author=popcornmix to
https://github.com/Hexxeh/rpi-firmware/commits/master is made on carefully
comparing dates and descriptions from commits.
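The changelog lookup above can be scripted so the hash is recorded before any test firmware is installed. The sketch below assumes the changelog keeps the "firmware as of <hash>" wording quoted above and lists the newest entry first, as Debian changelogs normally do; the function names are my own.

```python
import gzip
import re

# The changelog entry quoted above reads "firmware as of <40 hex digits>".
HASH_RE = re.compile(r"firmware as of ([0-9a-f]{40})")


def latest_firmware_hash(changelog_text):
    """Return the first firmware hash mentioned in the text, or None."""
    match = HASH_RE.search(changelog_text)
    return match.group(1) if match else None


def read_changelog(path="/usr/share/doc/raspberrypi-bootloader/changelog.Debian.gz"):
    """Read the gzip-compressed changelog, equivalent to the zcat call above."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

Calling latest_firmware_hash(read_changelog()) on the Pi and saving the result gives the firmware-repo hash to cross-reference later.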

Learned something; thnx :)

@BenoitSvB: Your traces seem to show a different kind of issue - the firmware isn't giving any clues about why you are being disconnected. You might get some more clues from a packet sniffer such as WaveShark.

@mathieugouin @dh-connect @juched78 @maciex @duncanmcdowell: I have a Cypress engineer who is keen to find out more about your issues; if you send an email to me - phil at raspberrypi dot org - I can put you in touch with him. If you want to speed things along, install the debug modules as outlined above and save the output of dmesg when things go wrong.

@pelwell Google did not return much of substance on 'packet sniffer Waveshark', but I guess you meant WireShark. The fact that blacklisting brcmutil & brcmfmac while using a MT7601U dongle makes the erratic connect/disconnect behaviour disappear, combined with the frequent 'out-of-order' occurrences (see #1313, now hidden but not solved), makes me suspect a Broadcom/Cypress hardware cause.
Wireshark might be of help, but I would need assistance to set up and conduct a serious hardware debugging effort.

Yes, I meant wireshark.

You could use the dumpcap utility (part of the text-mode tshark package) to record all activity to a file, then kill it when the dmesg log includes a suspicious message. Something like this:

sudo apt-get install -y tshark
# You can say no when it asks if non-superusers can capture packets
dumpcap -D
# if your wlan isn't interface 2, change the next command to match
# Leave dumpcap recording in the background
sudo dumpcap -i 2 -q -w packets.pcap &
# Search for the error message, then kill the capture
dmesg -w | grep --max-count 1 "wlc_enable_probe_req: state down, deferring setting of host flags" && sudo killall dumpcap

Note that although "grep --max-count 1" is supposed to stop after one match, it seems to require one more line of input to actually make it stop, but that shouldn't be a problem in practice.

If your capture file gets too large you can get dumpcap to use a fixed duration recording using the "-b duration:60" option (for one minute). There is the possibility that restarting the capture like this could happen at a bad time and lose the interesting packets, but this is unlikely. You can always make it less likely by increasing the duration.
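If the extra line of input that grep seems to need becomes a nuisance, the waiting step can be done in a few lines of Python instead. This is a sketch under assumptions: wait_for_message and follow_dmesg are names I made up, and it relies on the same dmesg -w call used above.

```python
import subprocess


def wait_for_message(lines, needle):
    """Consume lines until one contains needle; return it, or None if input ends."""
    for line in lines:
        if needle in line:
            return line.rstrip("\n")
    return None


def follow_dmesg():
    """Yield kernel log lines as they arrive, using `dmesg -w` as above."""
    proc = subprocess.Popen(["dmesg", "-w"], stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line
    finally:
        # Stop dmesg once the caller is done with the generator.
        proc.terminate()
```

A call such as wait_for_message(follow_dmesg(), "wlc_enable_probe_req: state down") should return as soon as the matching line has been read, after which dumpcap can be stopped with sudo killall dumpcap as before.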

@BenoitSvB There is a thread here that suggests disabling roaming in the Pi3 WiFi driver as a way of avoiding connectivity issues. Roaming allows a device to automatically move between APs with the same SSID, but that is likely to be less useful on a static device such as a Pi3, and there is a suggestion that it can eventually lead to a total loss of connectivity.

Could you try enabling the roamoff module parameter? You need to create /etc/modprobe.d/brcmfmac.conf containing the following:

options brcmfmac roamoff=1

@pelwell: Disabling roaming is not the solution, but it made me experiment with different channels and a second access point. I discovered that the onboard wifi adapter only has problems with some channels (e.g. 1, 5) and only on the Linksys WRT160N with DD-WRT firmware. Curiously, though, none of my other wifi clients share this problem: they connect without problems on all offered channels on both access points. The good news for me is that I have a stable workaround (not using the channels the onboard wifi has problems with), but there is no clarity in the matter.
Do you want me to conduct specific testing?
By the way do we need to set parameter
options brcmfmac debug = 1
in the /etc/modprobe.d/brcmfmac.conf while using the special test-drivers?
And do you know a way to measure the uptime of a wifi association: then I could more systematically test all channels for longer periods without making gigantic capture files.

I was assured that the requested debugging is enabled in the debug drivers by default (it has the same effect as options brcmfmac debug=0x100000), but feel free to experiment with different values.

I'm not aware of any way to measure uptime for an association, other than polling frequently and hoping to spot a change.
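The polling approach can at least be kept cheap: record a timestamp and an up/down flag every few seconds, then reduce the samples to association periods afterwards. The sketch below is one way to do that; the function names are hypothetical, and is_associated assumes the iw tool is installed and that iw dev wlan0 link prints "Not connected." when there is no association.

```python
def is_associated(iw_link_output):
    """Interpret the output of `iw dev wlan0 link` (assumed wording, see above)."""
    return "Not connected" not in iw_link_output


def association_periods(samples):
    """Reduce (timestamp_seconds, is_up) samples, in time order, to a list of
    (first_seen_up, last_seen_up) pairs, one per unbroken associated stretch."""
    periods = []
    start = last = None
    for timestamp, up in samples:
        if up:
            if start is None:
                start = timestamp
            last = timestamp
        elif start is not None:
            periods.append((start, last))
            start = None
    if start is not None:
        periods.append((start, last))
    return periods
```

Because only one short line is stored per poll, each channel can be tested for days without producing a large capture file. Drops shorter than the polling interval can still be missed.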

A Cypress employee is aware of this thread, but drop me an email (phil at raspberrypi dot org) if you are happy to be contacted directly.

Hello,

Is there any progress on this issue? I can connect to my open Wi-Fi network, and after a random time I have this in my logs:

Sep 26 22:42:36 dhcpcd: wlan0: carrier lost
Sep 26 22:42:36 kernel: brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
Sep 26 22:42:36 kernel: cfg80211: World regulatory domain updated:
Sep 26 22:42:36 kernel: cfg80211: DFS Master region: unset
Sep 26 22:42:36 kernel: cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
Sep 26 22:42:36 kernel: cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
Sep 26 22:42:36 kernel: cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
Sep 26 22:42:36 kernel: cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211: Regulatory domain changed to country: CH
Sep 26 22:42:36 kernel: cfg80211:  DFS Master region: ETSI
Sep 26 22:42:36 kernel: cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
Sep 26 22:42:36 kernel: cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211:   (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
Sep 26 22:42:36 kernel: cfg80211:   (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
Sep 26 22:42:36 kernel: cfg80211:   (5490000 KHz - 5710000 KHz @ 160000 KHz), (N/A, 2700 mBm), (0 s)
Sep 26 22:42:36 kernel: cfg80211:   (57000000 KHz - 66000000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)
Sep 26 22:42:36 dhcpcd: wlan0: deleting address 2a02::xxxx/64
Sep 26 22:42:36 dhcpcd: wlan0: deleting default route via fe80::xxxx
Sep 26 22:42:36 dhcpcd: wlan0: deleting route to 2a02:xxxx::/64
Sep 26 22:42:36 dhcpcd: wlan0: deleting address fe80::xxxx
Sep 26 22:42:36 dhcpcd: wlan0: deleting route to 10.206.0.0/16
Sep 26 22:42:36 dhcpcd: wlan0: deleting default route via 10.206.0.1

And then I can't ping the router.

After a ifdown wlan0 && ifup wlan0 it works again, until the next wlan0: carrier lost.

Power management is disabled, bluetooth is disabled, roaming is disabled (as you suggested) and my version is Linux pi3 4.4.17-v7+.

It always happens when bridging eth0 with wlan0; I got the same issue as https://github.com/raspberrypi/linux/issues/1375

I have exactly the same issue of Pi3 onboard WiFi dropping out after a random period of time. ifup gets it running again no problem.

After much investigation, I found it was due to having three APs (BSSIDs) with one SSID (1 each on channel 1, 6, & 11). This setup supports seamless roaming and works perfectly with all other WLAN clients.

Enabling debugging/logging with standard driver seems to show that at some stage the Pi decides to deauthenticate and even blacklists one of the BSSIDs. Reason is unclear, but seems to be a decision made at the Pi end.

When I have exactly the same config on the Pi but with only one BSSID for the SSID, Pi can hang on for days without a hitch.

Unfortunately, disabling roaming as per pelwell's link (http://projectable.me/optimize-my-pi-wi-fi/) isn't really feasible, having only one BSSID per SSID isn't an option, and I'd rather not have to rely on a script that pings some host & then runs ifdown/ifup.

Is any further investigation being done towards supporting multiple BSSIDs per SSID, or can I do something specifically to support the investigation?

Thanks!

I'm having this problem and my network is similar to @TheOriginalMrWolf's.
I have an Apple base station and a airport express in a mesh configuration using WDS.

I'm having this issue too. If I copy files to a samba share, the wifi connection is lost (raspberry 3, new installed raspbian).
Syslog:
brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

I'm getting exactly same issue when playing music with upnp (gmediarender).

I'm having the same issue when starting voice calls on wechat, with the rpi as an AP using hostapd. I get a bunch of spam like this:

[19841.278019] net_ratelimit: 940 callbacks suppressed
[19841.304748] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.331372] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.361587] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.399362] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.434506] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.466598] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.496736] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.525425] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
[19841.552678] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!

With traces like this:

[19837.728722] ------------[ cut here ]------------
[19837.730033] WARNING: CPU: 3 PID: 503 at drivers/net/wireless/brcm80211/brcmfmac/core.c:1191 brcmf_netdev_wait_pend8021x+0xdc/0xe8 [brcmfmac]()
[19837.732645] Modules linked in: xt_REDIRECT nf_nat_redirect xt_tcpudp nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack cdc_ether sr_mod cdrom brcmfmac brcmutil cfg80211 bcm2835_rng rng_core bcm2835_gpiomem bcm2835_wdt uio_pdrv_genirq uio sch_fq_codel snd_bcm2835 snd_pcm snd_timer snd ip_tables x_tables ipv6
[19837.743040] CPU: 3 PID: 503 Comm: hostapd Not tainted 4.4.38-1-ARCH #1
[19837.745188] Hardware name: BCM2709
[19837.747428] [<80015e54>] (unwind_backtrace) from [<80012ccc>] (show_stack+0x10/0x14)
[19837.752350] [<80012ccc>] (show_stack) from [<804f7dcc>] (dump_stack+0x94/0xb4)
[19837.755134] [<804f7dcc>] (dump_stack) from [<8002e958>] (warn_slowpath_common+0x84/0xb4)
[19837.760698] [<8002e958>] (warn_slowpath_common) from [<8002ea24>] (warn_slowpath_null+0x1c/0x24)
[19837.767009] [<8002ea24>] (warn_slowpath_null) from [<7f2a50b4>] (brcmf_netdev_wait_pend8021x+0xdc/0xe8 [brcmfmac])
[19837.774038] [<7f2a50b4>] (brcmf_netdev_wait_pend8021x [brcmfmac]) from [<7f2950b4>] (send_key_to_dongle+0x94/0xe8 [brcmfmac])
[19837.781637] [<7f2950b4>] (send_key_to_dongle [brcmfmac]) from [<7f2972a8>] (brcmf_cfg80211_add_key+0x16c/0x324 [brcmfmac])
[19837.789919] [<7f2972a8>] (brcmf_cfg80211_add_key [brcmfmac]) from [<7f125ae8>] (nl80211_new_key+0x11c/0x28c [cfg80211])
[19837.798477] [<7f125ae8>] (nl80211_new_key [cfg80211]) from [<807441ec>] (genl_rcv_msg+0x254/0x3c8)
[19837.807003] [<807441ec>] (genl_rcv_msg) from [<80743564>] (netlink_rcv_skb+0xb4/0xd8)
[19837.815674] [<80743564>] (netlink_rcv_skb) from [<80743f88>] (genl_rcv+0x24/0x34)
[19837.824371] [<80743f88>] (genl_rcv) from [<80742efc>] (netlink_unicast+0x188/0x218)
[19837.833161] [<80742efc>] (netlink_unicast) from [<807432cc>] (netlink_sendmsg+0x278/0x330)
[19837.842135] [<807432cc>] (netlink_sendmsg) from [<806fa454>] (sock_sendmsg+0x14/0x24)
[19837.851174] [<806fa454>] (sock_sendmsg) from [<806faadc>] (___sys_sendmsg+0x1d0/0x1d8)
[19837.860301] [<806faadc>] (___sys_sendmsg) from [<806fb780>] (__sys_sendmsg+0x3c/0x68)
[19837.869517] [<806fb780>] (__sys_sendmsg) from [<8000f240>] (ret_fast_syscall+0x0/0x34)
[19837.878793] ---[ end trace e4988f6034c7c2ec ]---

The trace looks suspiciously similar to @jrmhaig's.

I just had this happen again, and did some debugging. I got some different messages this time, which seem interesting (seems they are the same messages that @maciex got once):

[25353.256286] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[25355.254920] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[25355.257952] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -52

  1. It looks like the whole system freezes when this happens. Running while sleep 1; do date; done results in a gap in the output when the freeze occurs. I wonder if this means that brcmf_proto_bcdc_msg returning -110 (timeout) is just a symptom of the real issue, i.e. the driver simply logs wherever the freeze happens to hit.
  2. I measured (with vcgencmd) the temperature and voltages at the time of the freeze. Nothing to report there, as far as I can tell.
  3. My system is an AP with forwarding to a ZTE 4G modem via USB (i.e. client -> wlan0 -> rpi -> usb0 -> 4G). It seems that usb0 is still able to access the internet when the wifi freeze happens.
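The gap check from item 1 can be automated by logging a timestamp every second and scanning the log afterwards. This is a minimal sketch: find_gaps is a name of my own, and the one-second tolerance is an arbitrary choice.

```python
def find_gaps(timestamps, expected=1.0, tolerance=1.0):
    """Return (previous, current, gap_seconds) for each pair of consecutive
    timestamps that are further apart than expected + tolerance."""
    gaps = []
    previous = None
    for current in timestamps:
        if previous is not None and current - previous > expected + tolerance:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps
```

Feeding it the values of time.time() written once per second to a file shows whether the whole system stalled at the moment the -110 errors were logged, or whether only the wifi side was affected.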

Re: the comments above, this happens in NAT sharing mode for me with roamoff=1. Neither of those fixed or mitigated the issue for me.

After disabling WPA (using create_ap -w 2 in my case to only enable WPA2), the problem seems fixed. Unclear why though.

I am also facing the issues reported here. In my case it happens whenever I access files (usually mp3) through Samba from Samsung + ES file manager and player.

My Raspberry Pi 3 is connected to my AP over wifi, so all communication with it goes through the wifi network. It has no monitor, keyboard, or mouse.

I can easily reproduce the error, so if anyone wants me to produce log files, let me know how I could help.

Below my syslog entries.

Dec 27 16:11:50 raspberrypi kernel: [ 560.902063] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Dec 27 16:11:52 raspberrypi kernel: [ 562.928930] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:11:54 raspberrypi kernel: [ 564.926659] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:11:54 raspberrypi kernel: [ 564.926820] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -52
Dec 27 16:11:56 raspberrypi kernel: [ 566.924560] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:11:58 raspberrypi kernel: [ 568.922555] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:11:58 raspberrypi kernel: [ 568.928073] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -52
Dec 27 16:12:00 raspberrypi kernel: [ 570.920675] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:12:02 raspberrypi kernel: [ 572.918980] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:12:02 raspberrypi kernel: [ 572.924580] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -52
Dec 27 16:12:04 raspberrypi kernel: [ 574.917259] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:12:06 raspberrypi kernel: [ 576.915703] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Dec 27 16:12:06 raspberrypi kernel: [ 576.921498] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -52
Dec 27 16:12:06 raspberrypi ifplugd(wlan0)[1149]: Using detection mode: IFF_RUNNING

@rcassaniga
I also had the same problem with the identical setup.

Solution after hours of debugging:
Turn off IPv6 on the raspberry in /etc/modprobe.d/ipv6.conf:
alias net-pf-10 off
alias ipv6 off
options ipv6 disable_ipv6=1

This is only a workaround if you don't use ipv6 in your network.

Thank you @varl0g you are my hero! :)
Looks like this workaround is working for me, can't reproduce the problem any more.

@varl0g: It seems the workaround worked, because I cannot reproduce the error.

Thanks and happy 2017.

I tried turning off ipv6. That didn't make a difference. I tried turning off power save mode. Still no difference. However, when I set my AP's channel to 6 (instead of 11), my Raspberry Pi has been up for 2 days with no problems!

I'd like to confirm that the workaround with turning off IPv6 doesn't work.
Unfortunately, I have the same issue with my RPi3 and Apple Airport Extreme router.

@rajid, @dh-connect
Surprisingly, it solved my issue too when I changed my AP's wifi channel to 6 instead of automatic. Thanks @rajid

I too have this bug - brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Where is the fix?
I tried the 4.9 kernel and the 4.4.41 kernel; both have this bug. Power supply is 2.4 A.

I have to revoke my previous comment regarding channel 6. Apparently, it was a coincidence that my RPI3 had stable WiFi for 6 days.

Just wondering if anyone has had any luck with this issue. I have tried disabling power management, bluetooth, and switching channels. Nothing so far has worked. I'm running Octoprint with a webcam attached. It seems to happen fairly often, and I notice it happens a lot more frequently when I have more than one http connection established.
syslog error prior to power saving mode:
brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
syslog error after power saving mode:
octopi kernel: [10317.342360] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
octopi kernel: [10317.342593] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
octopi kernel: [10327.358384] brcmfmac: brcmf_sdio_bus_txdata: out of bus->txq !!!
I'm currently running Linux octopi 4.1.19-v7+ #858 SMP Tue Mar 15 15:56:00 GMT 2016 armv7l GNU/Linux

I finally got my RaspPi 3 to be stable on my wifi by changing my wifi's 2.4 GHz channel to "6". I forget what it was before; 11, I think, but I'm not sure. That didn't work well, and I found a web page which said that was a problem but that 6 works fine. It's been much better ever since I switched my house wifi to channel 6.

/raj

channel 6, channelwidth 20 MHz, looks stable for weeks now.

I had observed the same problem as first reported by @dh-connect, i.e. was seeing the error -52 log lines. Turning off power saving for the wifi interface did not help. Turning off ipv6 as suggested by @varl0g solved my issues. Wifi is now stable for days and days whereas it would formerly break down minutes after bootup.

I haven't had luck on channel 6 or 7. Confirmed no one else is on those channels.
I tried flashing my sd with a new image and now my wifi controllers aren't getting proper DHCP leases. They are booting up with 169.254.xx.xx local ips, not the subnet from my dhcp server.

Decided to just wipe it and go install newest raspbian and install octoprint from source. so far no issues.

From what I can tell, this is an issue in the driver software of brcm80211 sdio.c itself.
The string 0x40012 is really 0x00040012, which, when interpreted using the masks and codes from here (~line 55), can be seen as a mailbox string indicating a flow control change to DEVREADY. What is odd is that the string is never interpreted as such, and thus hits the backwards-compatible section of the driver (~line 1127 of the sdio.c file within the brcm80211/brcmfmac source here).
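To make the mask arithmetic easier to check, here is a minimal Python sketch that splits a mailbox word into its fields. The bit names and mask values are quoted from memory of the brcmfmac sdio.c definitions and should be treated as assumptions; in particular, I am not certain the 0x0010 bit (named FWHALT in newer trees, as far as I know) is defined in the 4.4 driver. The function name decode_hostmail is hypothetical.

```python
# Assumed definitions; verify against the sdio.c in the tree you are running.
HMB_FLAGS = {
    0x0001: "NAKHANDLED",
    0x0002: "DEVREADY",
    0x0004: "FC",
    0x0008: "FWREADY",
    0x0010: "FWHALT",  # assumption: may not exist in older drivers
}
HMB_DATA_VERSION_MASK = 0x00FF0000
HMB_DATA_VERSION_SHIFT = 16
HMB_DATA_FCDATA_MASK = 0xFF000000
HMB_DATA_FCDATA_SHIFT = 24


def decode_hostmail(value):
    """Split a to-host mailbox word into flag names, version and fcdata."""
    flags = [name for bit, name in sorted(HMB_FLAGS.items()) if value & bit]
    return {
        "flags": flags,
        "version": (value & HMB_DATA_VERSION_MASK) >> HMB_DATA_VERSION_SHIFT,
        "fcdata": (value & HMB_DATA_FCDATA_MASK) >> HMB_DATA_FCDATA_SHIFT,
    }
```

With these assumed definitions, decode_hostmail(0x40012) puts the 4 in the version field and leaves the low bits 0x02 and 0x10 set, so it is worth double-checking the field boundaries against your own tree before drawing conclusions about flow control.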

I don't have great experience with the driver itself, nor the ability to recompile and test (I've only got one rpi3, and I'd rather not mess up the environment it is currently living in; also, I'm not versed in recompiling/updating Linux drivers), so I'm not exactly positive, but it seems that two HMB messages are being sent back-to-back so quickly that the driver doesn't have enough time to interpret them both.

For those wondering, I am currently running octoprint (Manually built) on my rpi3 over wireless (duh..) with the adafruit pitft2.8" capacitive touch screen and adafruit's custom kernel (v 4.4.27-v7+) and duplicate the issue when trying to access the video stream (Logitech C270) on my Samsung Galaxy S7 via PrintDroid pro or via Chrome. The lockup happens without-fail each time this is performed, and only happens on wireless. I have upgraded the power supply, disabled ipv6 and power management, to no avail.

@TGYK Can you check out the referenced issue - does it seem the same to you? What messages do you get in dmesg? kevent dropped?

@TGYK. You have linked to the original Broadcom github page - can you give some indication of where the issue appears in the Raspberry Pi kernel tree on here? It is a bit difficult to track down what lines of code you are referring to.

sdio.c is here in the 4.9 tree https://github.com/raspberrypi/linux/tree/rpi-4.9.y/drivers/net/wireless/broadcom/brcm80211/brcmfmac.

@JamesH65 In the github page you linked, the line I am referring to lies around 1140-1147. As far as the dmesg error, the message is the same issue as seen above:
"Unknown mailbox data content: 0x40012", followed by escan (-52) errors.

The same issue as your referenced topic does not happen, as I am not bridging my wireless and wired interfaces in any way. As far as I can tell, my issue, and that of this thread pertains solely to the wireless interface.

Thanks for the information. The possibly linked issue is, I believe, similar in that it is an issue with the Wifi driver possibly getting an odd message causing more oddness later in the stack, but I am still digging.

I'm having the same issue with a raspberry pi Zero W with a similar symptom as @TGYK. In my case, I'm running mpd on the zero, and controlling it via an android client on a Samsung Galaxy S5. Without fail, if I put the phone in standby while the controller app is running (ie., without returning to the home screen first), the zero's wifi breaks with the "Unknown mailbox data content" message. If I just leave the device idling, or am careful to always close the app before letting my phone go to sleep, it stays up indefinitely.

I've had this issue on Raspbian and now OSMC.

Mostly intermittent, but interestingly accessing the Kodi web interface from my S7 will always trigger this issue. Doing the same from my wife's iPhone works flawlessly and has never triggered the problem.

@daedalia : I have a very similar issue with a Samsung Galaxy Tab S. However, I don't have access to an iPhone/iPad device to confirm...

My Samsung device crashes the wifi when trying to access the tvheadend web interface.

It does not happen when accessed from a Firefox browser from a windows PC.

Glad I found this thread, thought I was the only one. I'm having the same issues as the posters above, wifi dropping out on my pi3/osmc when accessed from a Samsung Galaxy Tab A. Works fine if accessed from Nexus 7 tablet, OnePlus phone or Acer laptop, only the Samsung gives problems. Easily repeatable. Seems to be the samsung wifi driver doesn't like the inbuilt pi3 wifi? Adding a tp-link wifi dongle to the pi3 is a workaround for me.

@philborman I'm curious, do you use the same mobile browser on the Samsung vs the Nexus?

Both running Chrome, but it's not just a browser issue. If I use Yatse to control Kodi it works fine from the Nexus/mobile/laptop, but the pi3 WiFi drops if I try the same from the Samsung. Same if I ssh in: it crashes with the Samsung and not the others. With ssh I can do a little, but any file transfers or even editing a text file will cause the wifi to disconnect.

Does anyone commenting here have the ability to build a kernel with a patch I have that might help with this? These are based on 4.9 but might work OK on 4.4. Note, these are only tests...

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index df60c98..82f618c 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -2076,6 +2076,13 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
                        return NULL;
        }

+       if (skb_cloned(skb))
+       {
+               printk(KERN_ERR "Found a cloned skb");
+               if (pskb_expand_head(skb, 8, 0, GFP_ATOMIC))
+                              return NULL;
+       }
+
        if (csum) {
                if (skb->len <= 45) {
                        /* workaround - hardware tx checksum does not work
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
index a190f53..402beb1 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
@@ -2100,6 +2100,14 @@ int brcmf_fws_process_skb(struct brcmf_if *ifp, struct sk_buff *skb)
        int rc = 0;

        brcmf_dbg(DATA, "tx proto=0x%X\n", ntohs(eh->h_proto));
+
+       /* Possible we might receive a cloned skb, if this happens
+        * we must unclone it as we are going to be alter the data by
+        * adding headers.
+        * unclone will only do anything if it is cloned so no check required
+        */
+       skb_unclone(skb, GFP_ATOMIC);
+
        /* determine the priority */
        if ((skb->priority == 0) || (skb->priority > 7))
                skb->priority = cfg80211_classify8021d(skb, NULL);

Hello,

I have the very same problem described here with one of my 2 Pi3. Wifi loses connection after some time, can be anything between 30 minutes and a few hours. I've tried absolutely everything suggested here, including changing wifi channels on the AP, etc... with no success. What is extremely strange is that on my 2nd Pi3 (rev 1.2 too, exactly the same), and with the very same SD card/installation (Raspbian) that I swap between both, Wifi is rock solid for days and days...

This is really strange. Both Pi3 are updated with rpi-update, kernel 4.9 and firmware #991, but it was already the same with previous kernel/firmware releases.

If you do an rpi-update you will get the above patches as accepted by the kernel devs (for the smsc95xx driver and the brcmfmac driver), as of last night. Could you try that? If that still fails, can you run 'dmesg' and see if there is anything odd in the log? Although, my suspicion is a HW fault that becomes apparent as the wireless chip warms up, given that another Pi works fine with the same card.

Thanks. I did that on the suspect board; wifi disconnected after a few minutes.
dmesg gives this:
[ 266.654964] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep state -110
[ 266.655033] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and terminate frame
[ 266.659215] brcmfmac: brcmf_sdiod_regrw_helper: failed to write data F1@0x1000d, err: -110
[ 266.663346] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x1001a, err: -110
[ 266.667472] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x10019, err: -110
[ 266.671608] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x1001a, err: -110
[ 266.675736] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x10019, err: -110
[ 266.679866] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x1001a, err: -110
[ 266.683992] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x10019, err: -110
[ 269.655049] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep state -110
[ 272.069378] net_ratelimit: 35 callbacks suppressed

......... then this "loop" keeps filling the dmesg log several times per minute.

Edit: I touched all the components on the board and they are anything but hot, I would say around 30°C, just a little warmer than the skin of my fingers.

Hmm, the SDIO stuff is the interface between the Pi and the wireless chip - it's timing out (-110). This does look like a HW issue - as the chip heats up, perhaps there is a bad solder joint on the sdio interface lines somewhere that means the comms disconnects.

Ping @Roger-Thornton - Roger, is there anything we can do to test this?

@Crrispy Can you check that your Pi isn't underpowered - what does vcgencmd get_throttled report?

@pelwell : after a wifi loss, I checked, and throttled=0x0

I don't think it's a hardware fault; a simple reboot always solves the problem instantly.
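For readers checking their own boards: the value printed by vcgencmd get_throttled is a bitmask, so throttled=0x0 means no under-voltage or throttling flag has been set since boot. A minimal sketch for decoding it, assuming the bit layout given in the current Raspberry Pi documentation (the function name decode_throttled is made up for illustration):

```python
from typing import List

# Bit meanings as listed in the Raspberry Pi documentation for
# `vcgencmd get_throttled` (assumed to apply to the firmware in use).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output: str) -> List[str]:
    """Turn a line such as 'throttled=0x50005' into readable flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


print(decode_throttled("throttled=0x0"))
print(decode_throttled("throttled=0x50005"))
```

An empty list corresponds to the 0x0 reported above; any of the "has occurred" flags would point towards the power supply rather than the wifi driver.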

@JamesH65 I don't think this looks like a hardware manufacture issue as the lines are all functioning as they should be. If there are other pointers to hardware issues I can take a look at the board.

Doesn't seem to be the same problem as mine. I only have one pi3 and its wifi is rock solid until I connect from a Samsung tablet. Connect with anything else and it's fine. Doesn't seem to be power or overheating related, as it's absolutely fine for days until I connect from the wrong tablet and it falls over.

I'm guessing it's driver or firmware related, something that the Samsung driver sends that the pi3 doesn't like.


There have been a couple of fixes to networking in the latest raspbian -
when was the last time you did a

sudo apt-get update
sudo apt-get dist-upgrade

?
Might be worth a try to see if it fixes things.


--
James Hughes
Principal Software Engineer,
Raspberry Pi (Trading) Ltd

I have the same issue with a Raspberry Pi Zero W. After some time I'm not able to ssh to it. I tried everything. One funny fact: when I connected the Pi to my TV to do some troubleshooting for when ssh stops working, it ran rock solid for 18h. Then I switched the HDMI over to another device, and after some time when I wanted to ssh to the Pi I got a beautiful "no route to host". When I plugged the HDMI cable in again I was able to ping the gateway. No errors in the logs, iwconfig seems OK. systemctl restart networking helped.

As above, please try the latest Raspbian, and report back if you still see
the problem.


I may be the only one boneheaded enough for this to be the issue, but I discovered that my wpa_supplicant.conf had the country code set wrong (Missed that it was a separate configuration item from the other localization options). I won't say that the problem went away entirely, but once I fixed that, it stopped losing its network connection in the "every single time I connect from my samsung" way it was before.
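For anyone wanting to rule this out quickly, here is a minimal sketch that reports which country code, if any, a wpa_supplicant.conf declares. It only checks that a two-letter country= line is present; whether the code is the right one for your location is something only you can judge. The function name country_code is hypothetical.

```python
import re

# A top-level "country=XX" line; wpa_supplicant expects a two-letter
# ISO 3166-1 alpha-2 code here.
COUNTRY_RE = re.compile(r"^\s*country=([A-Za-z]{2})\s*$", re.MULTILINE)


def country_code(conf_text):
    """Return the declared country code in upper case, or None if absent."""
    match = COUNTRY_RE.search(conf_text)
    return match.group(1).upper() if match else None


sample = (
    "country=GB\n"
    "ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\n"
    "update_config=1\n"
)
print(country_code(sample))
```

On a real system the text would come from reading /etc/wpa_supplicant/wpa_supplicant.conf, which normally requires root.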

Just upgraded to latest (apt-get dist-upgrade) and it's looking hopeful.
My previous upgrade was about 2 weeks ago just before I reported the
initial problems. Working fine for the last couple of hours, no wifi
dropouts at all. Many thanks!


It's fixed for me on latest release.

Most of the other "fixes" seem to miss the point that my system worked
perfectly with everything except one tablet (Samsung), so it seems the
problem was the Samsung sending something the pi3 wifi driver/firmware
couldn't cope with.

If my country code was set wrong, why would only the Samsung cause
issues? Other tablets/phones/laptop all connect fine.

Anyway, it's fixed now - at least it's not fallen over in the last few
hours. More time will tell...


In my case it lasted for around 19h. After that I couldn't ssh anymore...

What is the difference between rpi-update and dist-upgrade?

Because after rpi-update I had 4.9.25+ #995, and then I ran dist-upgrade and the kernel reverted to 4.9.24+ #993. Anyway, for me the problem is still not fixed. What I did this time is use another rpi0w and a different PSU :) The last step is to use another SD card.

OK, thanks for the information.

Going to need some more information to try and replicate. Your setup, what
you have connected and the sort of network traffic going on, any dmesg logs
or other error messages once the ssh stops working.

Thanks.


Hello,

I've put the Pi3 in a case with a quite strong fan and temp in the room is currently 19°C so it can't be a heat problem. Swapped the power supply for another one (5V 3A too). Used another SD card, dist-upgrade then rpi-update.
Yesterday it was up for several hours and I hoped it was fixed, but after 3-4 hours it disconnected (ping -t running from my windoze machine).
Tried again this morning, wifi down after less than 20 minutes :-(
Still the -110 error from the wifi driver on sdio (see above), which repeats in a loop until the reboot.
And my other Pi3 connected for 3-4 days now, no problem.
So this might look as a hardware failure. But.. why does it never fail at boot up, and always work after a reboot? Really puzzling.
Why does it try to change "bus sleep state" since power management is disabled for wlan0 ? (sorry if the question is dumb).

Just done apt-get update; apt-get dist-upgrade. Unfortunately no change for me. From my observation the issue relates to bridging wlan0; I wonder if it could be related to promiscuous mode. I have tried rpi-update as well to check 4.9.25.

actually it's even worse than before as the connection get lost now usually just in few minutes and I can see usual logs

[  410.095280] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[  523.447618] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  526.007648] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  526.007659] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110

"Because after rpi-update I had 4.9.25+ #995 and then I made dist-upgrade and kernel reverted to 4.9.24+ #993."

That's odd. I did the dist-upgrade, went to 4.9.24+ #993 and when I do a rpi-update right now, it says I already have the latest firmware and it has nothing to do... why doesn't it upgrade to 4.9.25 / #995 ?

Actually have to say that using brcmfmac/wlan0 bridged seems to work more stable than with pure wlan0 (all with hostapd)

So, can you give a full and accurate description of your setup, along with
types of connected devices, and any dmesg error messages you may receive
when the wireless fails.

I really need some way of replicating the issue that doesn't take hours, so
any information provided that can help towards that is gratefully accepted.


I don't know if this is specifically related to this aspect of the issue, but IIRC I was able to reliably reproduce one way to make the wifi drop using an hcitool command. Maybe it's not possible anymore; it was about a year ago now, and we went with USB wifi to resolve the issue, which worked for a bunch of rpis...

https://github.com/raspberrypi/linux/issues/1342#issuecomment-245637144

@thomasf What was your system setup (standalone device, access point, bridged access point etc) and on which machine do you execute the hcitool command? A quick test on a device attached to another Pi via wireless showed no problems.

@JamesH65 We went through a lot of scenarios and the problem was reproducible in any configuration.

When the hcitool command was run on an rpi it usually lost the (wifi) network connection within seconds.. IIRC it was easier to reproduce if there were some network traffic on the device (like a file transfer).

Quickly looking at our final provisioning system, the following wpa_supplicant.conf was the final one we used. I think it doesn't look all that different from the one which caused problems with the internal wifi interface. I'm sure that we started out using just a single AP while still getting problems.

(SSIDs and keys redacted):

country=ID
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

network={
    priority=10
    ssid="..."
    psk="..."
}

network={
    priority=10
    ssid="..."
    psk="..."
}

network={
    priority=10
    ssid="..."
    psk="..."
}

network={
    priority=10
    ssid="..."
    psk="..."
}

# Thomasf home AP
network={
    priority=1
    ssid="MKONION"
    psk=...
}

I just found a script called troublemaker.sh in the provisioning repo now..

It's very hacky. I think I provisioned it to run on start up ~~like once every few minutes or so~~ (edit: probably just once, since it does some looping by itself) on a bunch of rpis to provoke problems and get some logs saved.

This was mainly used before I understood more about the problems. I think that ping times and packet loss were rising for a period before the wifi totally disconnected.

#!/bin/bash

set -e

sudo killall ping hcitool bash || true
nohup sudo bash -c 'while true; do date; iwconfig ; sleep 60; done' >>${HOME}/troublemaker_iwconfig.log &
nohup sudo bash -c 'while true; do date; ifconfig ; sleep 60; done' >>${HOME}/troublemaker_ifconfig.log &
nohup sudo hcitool lescan --duplicates >>${HOME}/troublemaker_lescan.log &
nohup ping -s 50000 192.168.1.1 >> ${HOME}/troublemaker_ping.log &
nohup sudo bash -c 'while true; do sleep 60; date; sudo hcitool name 11:11:11:11:11:11 ; done' >>${HOME}/troublemaker_hcitoolname.log &
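The impression that ping times and packet loss rise before the disconnect can be checked against a log like troublemaker_ping.log. A rough sketch, assuming the usual Linux (iputils) ping reply format with icmp_seq=N and time=X ms fields and sequence numbers starting at 1; the function name summarize and the window size are made up for illustration. It ignores sequence wrap-around and duplicate replies.

```python
import re

# Matches reply lines such as:
#   50008 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=12.3 ms
REPLY_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def summarize(lines, window=60):
    """Group replies into windows of `window` sequence numbers.

    Returns a list of (first_seq, loss_fraction, mean_rtt_ms) tuples;
    mean_rtt_ms is None for a window with no replies at all.
    """
    buckets = {}
    last_seq = 0
    for line in lines:
        match = REPLY_RE.search(line)
        if not match:
            continue  # header, summary or error line
        seq, rtt = int(match.group(1)), float(match.group(2))
        buckets.setdefault((seq - 1) // window, []).append(rtt)
        last_seq = max(last_seq, seq)

    summary = []
    for idx in range((last_seq + window - 1) // window):
        rtts = buckets.get(idx, [])
        # The final window may be only partly filled.
        expected = min(window, last_seq - idx * window)
        loss = 1.0 - len(rtts) / expected
        mean = sum(rtts) / len(rtts) if rtts else None
        summary.append((idx * window + 1, loss, mean))
    return summary
```

Feeding it the lines of the ping log and printing the result gives one row per minute at ping's default one-second interval, which makes a slow rise in loss or latency ahead of the drop easy to spot.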

Running the troublemaker script on the latest stable Raspbian (4.9 kernel) shows no errors, which is good, but bad for trying to replicate errors!

@ciekawy Looking back, you appear to be able to easily replicate an issue which we are unable to do here. Can you give me some idea of your exact setup, so I can investigate. Also worth grabbing the very latest rpi-update as there have been some fixes in there for USB which may or may not have relevance (if you are using ethernet). I'll need to know your network topology, how the Pi is set up, what seems to be instigating the issue. Anything really!

@JamesH65, My current setup is:

auto lo
iface lo inet loopback

iface eth0 inet manual

allow-hotplug wlan2 # internet access - wlan2 is Atheros AR9271 using ath9k_htc
iface wlan2 inet manual
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

allow-hotplug wlan1 # internal AP 1 - D-Link using rt2800usb
iface wlan1 inet static
    post-up iwconfig wlan1 power off
    hostapd /etc/hostapd/hostapd1.conf
    address 10.114.0.11
    netmask 255.255.255.0
    network 10.114.0.0
    broadcast 10.114.0.255

allow-hotplug wlan0 # internal AP 2 - integrated using brcmfmac
iface wlan0 inet static
    hostapd /etc/hostapd/hostapd.conf
    address 10.114.0.10
    netmask 255.255.255.0
    network 10.114.0.0
    broadcast 10.114.0.255

auto br0 # helper bridge to be independent on the wlan interface being used
iface br0 inet static
    bridge_ports wlan0 wlan1
    address 10.114.0.1
    netmask 255.255.255.0
    network 10.114.0.0
    broadcast 10.114.0.255

also as for brcmfmac

[    4.485558] brcmfmac: Firmware version = wl0: May 27 2016 00:13:38 version 7.45.41.26 (r640327) FWID 01-df77e4a7
[    9.306550] brcmfmac: power management disabled

It's worth mentioning that this RPi ran stable for days (although longer-lasting 10 Mbps transfers were able to cause some issues as well) when the roles were switched: wlan0 (brcmfmac) was used to connect to the internet and the local AP was running on wlan2 (ath9k). I changed the config because I need to use the better antenna connected to wlan2 for internet access.

My most recent rpi-update was on May 1st.

I have the exact same problem on an rpi3 using Archlinux-ARM.

After some hours running create_ap it stops working, with the dmesg messages already reported by others:
[11418.347554] brcmfmac: send_key_to_dongle: wsec_key error (-110)

Sometimes it works for a day without problems, and sometimes it works for only minutes before the problem occurs.

Linux alarm 4.9.25-2-ARCH #1 SMP Fri May 5 00:46:52 UTC 2017 armv7l GNU/Linux

Same issue on Pi Zero W, current Raspbian Lite. After some time (differs from minutes to hours) 'dmesg' shows
brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
From this point in time the wifi connection is gone and can only be restarted by rmmod'ing and modprobe'ing brcmfmac.

I disabled power management: no change.
I updated everything via apt-get upgrade / dist-upgrade: no change
I updated stuff via rpi-update: no change
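The manual recovery described above (spot the mailbox message, then reload the driver) could be automated along these lines. This is only a sketch of a workaround, not a fix. It assumes the driver logs a "brcmfmac: Firmware version" line each time it is loaded, which is used here to tell an old mailbox error from a fresh one; the function names are hypothetical.

```python
import subprocess

MAILBOX_SIGNATURE = "brcmf_sdio_hostmail: Unknown mailbox data content"
# Assumption: this line is logged whenever the driver (re)initialises.
RELOAD_MARKER = "brcmfmac: Firmware version"


def firmware_wedged(dmesg_text):
    """True if a mailbox error was logged after the most recent driver load."""
    wedged = False
    for line in dmesg_text.splitlines():
        if RELOAD_MARKER in line:
            wedged = False
        elif MAILBOX_SIGNATURE in line:
            wedged = True
    return wedged


def reload_brcmfmac(run=subprocess.check_call):
    """Unload and reload the driver, roughly `rmmod` then `modprobe` (needs root)."""
    run(["modprobe", "-r", "brcmfmac"])
    run(["modprobe", "brcmfmac"])


def check_once():
    """Read the kernel log and reload the driver if it looks wedged."""
    log = subprocess.check_output(["dmesg"], universal_newlines=True)
    if firmware_wedged(log):
        reload_brcmfmac()
        return True
    return False
```

Calling check_once() from a root cron job every few minutes would mimic what was being done by hand; nothing runs until that function is called.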

brcmfmac is bugged for sure. I was having the same problem with the dmesg message "brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012", and sometimes different messages too, as reported in my post above.

I am using a tp-link usb wifi adapter and my application is working fine now.

I hope broadcom can fix the bugs in brcmfmac.

Any workaround ?

As I mentioned early in this conversation, I changed my Wifi router to use channel 6 instead of 11 (which it was using before), and my rPi has been up ever since then (from back in Jan. until now) with no problems at all.

Might this be related to this kernel module note:

This generation of chips contain additional regulatory support independent of the driver. The devices use a single worldwide regulatory domain, with channels 12-14 (2.4 GHz band) and channels 52-64 and 100-140 (5 GHz band) restricted to passive operation. Transmission on those channels is suppressed until appropriate other traffic is observed on those channels. Within the driver, we use the ficticious country code “X2” to represent this worldwide regulatory domain. There is currently no interface to configure a different domain. The driver reads the SROM country code from the chip and hands it up to mac80211 as the regulatory hint, however this information is otherwise unused with the driver.
(from here: https://wireless.wiki.kernel.org/en/users/Drivers/brcm80211 )

I guess this means that even a country code "DE" (which should allow the higher wifi channels) has no effect? But I'm not sure that this could have an effect similar to the Unknown mailbox data content: 0x40012 issue...

At least for me it is no workaround - switched from channel 11 to channel 6 today, 2 hours later: Unknown mailbox data content: 0x40012

I had that issue until I increased signal strength with a range extender.
Could you test if connection is more stable moving the Pi to a spot with better signal?

Maybe it's caused by additional power needed to operate at poor signal strength.

Same problem as Crrispy.

For those that are working around this with a USB WiFi adapter (channel changing, etc, didn't work for me either), Edimax EW-7811Un worked immediately when I plugged it into my OTG USB cable on RPI Zero W. I didn't have to do any configuration or ifconfig - it was on the network right away! Yesterday, I flailed about with the TP-Link Archer T1U AC450 for a few hours.

@b3nj1 - sorry to butt in, but I got to ask --- why use an external wifi with a Zero W? Right, you know what the 'w' means. lol :)

I chose the same solution - bought a USB adapter with external antenna and mt7601 chipset (about 5 EUR) for my Zero W, works flawlessly. Should have bought the non-W in the first place ... this issue has existed for more than a year with no fix in sight.

@blacktigersoftware - it is odd, isn't it!? The Zero W WiFi works great. The Zero W Bluetooth works great. But, if I use both at the same time, the system becomes unbearably slow and eventually unreachable through wifi.

Been having a quick look at the mailbox issue described above. Google shows this seems to be happening a fair bit (with at least one reference to a non-Pi platform). The driver code detects that a message coming back from the mailbox (I presume a connection to the HW firmware) has some bits set that it shouldn't have. However, it only prints the message; it doesn't do any recovery or return an error. Since this seems to be a value returned from the firmware, I don't have access to that to actually see what is going on, and the datasheet on the chip is entirely unhelpful. So I think this one needs to be pushed to Broadcom/Cypress/linux-wireless for investigation.

Also worth noting we do appear to have the latest HW firmware according to https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/brcm. Files have one or two byte length differences but that's it.

The problem is that the mailbox error is followed by other errors (-52, -110, etc), and only rebooting the system gets the wifi back to work.

-110 is a timeout error, indicative of something else dying and not responding. -52 is an invalid exchange, which is along the same lines. I suspect that by the time the mailbox error has occurred the firmware on the chip is unwell, so these other errors carry on from that.
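For reference, the negative numbers in these brcmfmac messages are Linux errno values. A minimal sketch for annotating dmesg lines with them: the table holds only the two codes discussed here (values from the generic Linux errno headers), and the regular expression is an assumption about where the code sits in the line, based on the log excerpts posted in this thread.

```python
import re

# Only the two codes mentioned in this thread.
LINUX_ERRNO = {
    52: "EBADE (invalid exchange)",
    110: "ETIMEDOUT (connection timed out)",
}

# A trailing code such as "err: -110", "state -110" or "error (-110)".
ERR_RE = re.compile(r"-(\d+)\)?\s*$")


def explain(line):
    """Return a readable errno name for a brcmfmac dmesg line, or None."""
    if "brcmfmac" not in line:
        return None
    match = ERR_RE.search(line)
    if not match:
        return None
    code = int(match.group(1))
    return LINUX_ERRNO.get(code, "errno %d" % code)


for sample in (
    "[ 266.654964] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep state -110",
    "[11418.347554] brcmfmac: send_key_to_dongle: wsec_key error (-110)",
):
    print(explain(sample))
```

Lines without a trailing code, such as the "Unknown mailbox data content: 0x40012" message, return None.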

Is anyone who can replicate the problem able to build the latest Pi dev kernel (4.11, available from our github) and see if the mailbox error still occurs? Before I start pushing upstream I'd like to know it still happens on the latest kernel, and I have not been able to replicate it.

I can confirm the problem happens in: Linux alarm 4.9.25-2-ARCH #1 SMP Fri May 5 00:46:52 UTC 2017 armv7l GNU/Linux

I haven't tested on kernel 4.11.

The driver used in my tests: brcmfmac: Firmware version = wl0: Dec 15 2015 18:10:45 version 7.45.41.23 (r606571) FWID 01-cc4eda9c

@b3nj1 - wow, thanks for the heads-up

Everybody - does this only happen when the gpu is turned on?

The GPU is always on (to some extent), in all models of Pi.

Do you mean when Bluetooth is on?

@JamesH65 - I'll give 4.11 a try. Do I just clone/build according to the following? When cloning according to those directions, I'm on the rpi-4.9.y branch. Should I checkout rpi-4.11.y instead or something else?

https://www.raspberrypi.org/documentation/linux/kernel/building.md

Thanks in advance

Checkout the rpi-4.11.y branch, then rebuild using the instructions you
have linked to.


Just to repeat some information I've provided elsewhere, I've been testing the wireless connection under low power conditions. I've taken it down to the point of USB devices dropping out, but not seen any wireless connection issues. Whilst that is not proof that it is not a power issue, it's worth noting.

I happened to find how to reproduce this by running sudo memtester 256M 1 via SSH using my phone; the wifi dies as soon as memtester starts flooding with those "loading" characters:

Loop 1/1:
  Stuck Address       : ok
  Random Value        : \
                        ^-- Here

Strange thing, wifi only hangs while doing this on my phone. I've tried my PC, another pi and my router with no luck.

@JamesH65 - Update 2: I was able to boot with 4.11 (I misconfigured the kernel the first time).
Linux rpiz 4.11.2+ #2 Thu May 25 21:19:11 PDT 2017 armv6l GNU/Linux

Unfortunately, the system is still barely responsive when I hammer on BT.

When I plug back in the external USB WiFi and connect that adapter's address, everything is fine again.

  • Benjamin

New kernel built and installed from branch rpi-4.11.y following the instructions here: https://www.raspberrypi.org/documentation/linux/kernel/building.md.
Linux raspberrypi 4.11.2-v7+ #1 SMP Fri May 26 03:55:54 CEST 2017 armv7l GNU/Linux

Unfortunately, wifi still hangs with the same error:
brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

If you console in when the wifi goes out you can restart it. I am testing a bash script right now to see if this helps. I am going to run it in cron. Here it is if anyone is interested.

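#!/bin/bash
# Wi-Fi watchdog: if the router stops answering pings, restart the
# wireless service; if that does not help, run dhcpd on wlan0.

if ! ping -q -c 3 192.168.254.1 > /dev/null
then
    systemctl restart [email protected]
    sleep 3
    if ! ping -q -c 3 192.168.254.1 > /dev/null
    then
        dhcpd wlan0
    fi
fi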

I have been running this for a day and so far I have not noticed my wifi drop.

@JR1994
Is it still working?
How often are you running it?
Every minute?

I'll try it on some of my Raspberry Pis; I have several that I'm restarting every time they can't ping the router.

Thanks in advance

So far so good. I am checking every 2 min.

Note that the last firmware revision for brcmfmac is too old:

brcmfmac: Firmware version = wl0: Dec 15 2015 18:10:45 version 7.45.41.23 (r606571) FWID 01-cc4eda9c

@semeion Not sure what firmware you are using - the current one should be "Version: 7.45.41.26 CRC: 5932ca06 Date: Fri 2016-05-27 00:15:32 PDT Ucode Ver: 1043.2060 FWID: 01-df77e4a7". This is effectively the same as the one in the linux-firmware repo, although we do get ours direct from Brcm.

@JamesH65 That message was returned in dmesg.

$ dmesg | grep brcmfmac
[    7.242110] usbcore: registered new interface driver brcmfmac
[    7.337467] brcmfmac: Firmware version = wl0: Dec 15 2015 18:10:45 version 7.45.41.23 (r606571) FWID 01-cc4eda9c
[   15.072509] brcmfmac: power management disabled

But vcgencmd version shows:

$ /opt/vc/bin/vcgencmd version

# Firmware Version #
May 30 2017 15:23:29 
Copyright (c) 2012 Broadcom
version b8cdd5ae76f39d9f353dfa8fb48bf7e33b74903c (clean) (release)

That's not the Wifi chip firmware, that's the SoC firmware, which gets
updated fairly frequently.

Still not sure why your system thinks it has that old firmware though. You
have very recent SoC firmware so presumably you have done an apt-get upgrade
recently?
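
To make the distinction concrete: the Wi-Fi chip firmware version is the one brcmfmac prints in dmesg, while vcgencmd reports the SoC firmware. Below is a minimal sketch for pulling the Wi-Fi version out of the log. It assumes the line format quoted above, and the function name brcmf_fw_version is made up for illustration.

```shell
#!/bin/bash
# Print the brcmfmac Wi-Fi chip firmware version found in dmesg-style
# text on stdin. Assumes the line format quoted in this thread:
#   brcmfmac: Firmware version = wl0: <date> version X.Y.Z.W (...) FWID ...
# brcmf_fw_version is a hypothetical helper name, not part of any tool.
brcmf_fw_version() {
    sed -n 's/.*brcmfmac: Firmware version = .* version \([0-9.]*\) .*/\1/p' | head -n 1
}
```

Typical use would be `dmesg | brcmf_fw_version`, then comparing the result with the version string mentioned by James above.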

@JamesH65 Like I said above, I am using Arch Linux ARM, which is a rolling release distro, and yes, my system is up to date via pacman -Syu (the equivalent of apt-get update/upgrade).

No idea why the firmware is so old on my system. Maybe that is the cause of the bug. What do you think?

Anyway, the bug happens with Raspbian too, right? It was reported in March 2016, so it is old.

PS. English isn't my native language, sorry for any errors or misspellings.

OK, hadn't realised you were using ARCH. Sounds like they are not supplying
a recent firmware blob for the Wifi chip. You could update it manually, it
might fix your issue, it might not - I think there are probably multiple
wireless bugs, and there is no guarantee that the one you are seeing is the
one people are seeing on Raspbian.

You should report the out of date firmware to the arch maintainers, and
perhaps the wireless bug as well, as that might be down to the Arch distro.

Note that generally we don't support other distros; our in-house distro is
Raspbian, so to investigate an issue we need to be able to replicate it on
that.

@jamesH65 Yeah, indeed. I will try asking in #archlinux-arm why that firmware is old. Anyway, I will keep following this issue and looking for a solution, and I will report anything I discover here.

Thanks in advance.

@JamesH65 I am able to replicate it consistently on my Raspbian (RPi 3). If there's something I can do to help with this, let me know!

What is your setup? How do you replicate the issue?

Have a look in the previous comments, I've explained how to reproduce it not long ago.
The Pi's running full Raspbian with a 3.5" screen on top using the official power supply. Nothing fancy, everything is kept updated with rpi-update and apt upgrade.

After a few days the internal wifi stops working with the following message in dmesg:

[643660.135429] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[643710.076781] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[643712.636821] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[643712.636834] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[643800.318024] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[643802.878064] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[643802.878076] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[643861.598874] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[643862.558872] brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets
[643865.118918] brcmfmac: send_key_to_dongle: wsec_key error (-110)
[643867.679113] brcmfmac: brcmf_cfg80211_change_station: Setting SCB (de-)authorize failed, -110
[643868.638966] brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets
[643871.199007] brcmfmac: send_key_to_dongle: wsec_key error (-110)
[643873.759040] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[643876.319079] brcmfmac: brcmf_cfg80211_change_station: Setting SCB (de-)authorize failed, -110
[643878.879108] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[643881.439147] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[643883.999183] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[643886.559225] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[652339.956933] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110

I run hostapd on this interface and have another usb wifi interface attached to the Pi. My system information:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.9.24-v7+ #993 SMP Wed Apr 26 18:01:23 BST 2017 armv7l GNU/Linux
pi@raspberrypi:~ $ lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 8.0 (jessie)
Release:        8.0
Codename:       jessie

Yeah, and once it shows that (-110) error you need to reboot the system to get it working again...

Good to know it happens on Raspbian too, so the bug is distro-independent. The same thing happens on Arch Linux.
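
For reference, -110 is the Linux ETIMEDOUT errno, so these messages mean the driver timed out waiting for the chip firmware to answer. A rough sketch for counting such lines in a log follows. It assumes the message format shown in the dmesg output above, and brcmf_timeout_count is a hypothetical helper name.

```shell
#!/bin/bash
# Count brcmfmac log lines on stdin that end in the -110 (ETIMEDOUT)
# status, with or without a closing parenthesis, as in the dmesg
# output quoted in this thread.
# brcmf_timeout_count is a hypothetical helper name.
brcmf_timeout_count() {
    grep -c -E 'brcmfmac: .*-110\)?$'
}
```

Something like `dmesg | brcmf_timeout_count` could then be used by a watchdog to decide that the firmware has hung; whether a non-zero count always means a hang is an assumption, not something confirmed in this thread.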

However, since I moved my wifi from channel 11, to channel 6 instead, I haven't seen the problem since. I see, from my previous replies on this thread, that it's been since Jan. 7th when I made the change to channel 6. I'm currently running two RaspPI Zero W's and one RaspPi 3, all with no problems. The two RaspPi W's are running DietPi.

I also have this issue on my Raspberry Pi 3. Tried different wifi channels already.
I observed that if I connect the LAN port as well, wifi is stable as hell. As soon as I unplug the LAN port, wifi keeps dropping all the time.

That's really weird......!

I have the
"brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012"
issue as well in my rpi3. my most reliable workaround to prevent this error has been
"wondershaper 9000 9000"
I hope the root cause is determined.

I have the exact same problem. My pi3 has the following symptoms when connected with WIFI ONLY:

  1. OUTGOING wifi works GREAT. I can connect to the internet and download files with no problems on my pi3.
  2. ALL INCOMING wifi connections fail. Pings timeout, port 80 http accesses timeout, ssh fails, everything fails INBOUND ONLY.
    NOTE:
  3. Once Ethernet is connected to the pi3, then the wifi works BETTER but it is still dropping packets.
  4. Once Ethernet is removed again, wifi completely fails all inbound connections.
  5. Once Ethernet is connected again to the pi3, the wifi works BETTER and allows some incoming packets. but it still drops many of them.

Please fix this!

I have noticed the following on ifconfig:

RX packets: 1613 errors:0 dropped:1074 overruns:0 frame:0
TX packets:146 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000

So basically the RX side of the pi3's WIFI is dropping packets like crazy. No wonder why it won't respond to incoming connections. TX works fine!
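
To put a number on it, here is a small sketch that turns the RX line into a drop percentage. It assumes the older net-tools ifconfig layout quoted above ("RX packets:N errors:N dropped:N ..."); the function name rx_drop_percent is made up for illustration.

```shell
#!/bin/bash
# Read ifconfig output on stdin and print the integer percentage of
# received packets that were dropped, based on the first "RX packets"
# line. Assumes the older net-tools layout quoted in this thread.
# rx_drop_percent is a hypothetical helper name.
rx_drop_percent() {
    awk '/RX packets/ {
        line = $0
        gsub(/: +/, ":", line)            # tolerate "packets: 1613"
        n = split(line, f, /[ \t]+/)
        for (i = 1; i <= n; i++) {
            split(f[i], kv, ":")
            if (kv[1] == "packets") p = kv[2]
            if (kv[1] == "dropped") d = kv[2]
        }
        if (p > 0) printf "%d\n", d * 100 / p
        exit
    }'
}
```

With the counters above (1074 dropped out of 1613 received), `ifconfig wlan0 | rx_drop_percent` would print 66, i.e. roughly two thirds of inbound packets are being dropped.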

Ever since I set up that script I have had no issues with wifi on both my
RPI3s.

It's all very well saying "please fix this" but problems like this are absolute bastards to find. It took a month to find an error in the smsc/brcmfmac drivers when bridging, and I was lucky to find it, and I suspect this one is rarer and more difficult to find. If anyone can find a replicatable test case that shows the error quickly, that would be of great help. Some people seem to get the kevent error frequently, I get it very rarely.

As for the issue with dropped packets, this is being looked into when I have gaps in my schedule. In the case above, you seem to be dropping almost all the packets, which is most odd, and not usually seen by the majority of people. Does this happen with all devices connected to the Pi, or just one in particular?

sorry, james!

I'm not sure what you mean by all devices connected to the Pi. The dropped packets are from performing ifconfig directly on the pi. The pi is connected via wifi to a router. When the pi is connected to the wifi network only, it is constantly receiving and dropping packets.

@JamesH65 Well, I agree with you, it is hard to solve... But on Arch Linux ARM, if you install the "create_ap" package and enable it (pacman -S create_ap; systemctl start/enable create_ap), you can get the -110 error and the "Unknown mailbox data content: 0x40012" message within a few minutes of operation... Just connect a smartphone and/or a smart TV to it a few times and the error will appear.

We do not support Arch, Raspbian is our supported OS and that is the one I
need to be able to fix the issue in. I have no idea what version of the
kernel or drivers that ARCH uses, they could be very different from the
ones in Raspbian.

Are the people still seeing the issue using the Pi as an access point?
Using bridging? IPv4 or IPv6? This is the sort of information (not
exclusive of course, as much information as possible is required) required
to replicate issues.

Note that Broadcom have been informed of the mailbox error (it's their chip
and driver, of course), but things tend to move slowly with them.

I'm using static IPv4 on some devices and having the same problem as the
others using DHCP.

One thing to note is that the wifi was working perfectly for me from the time I got my pi3 last year to about 3 months ago when the wifi stopped working.

Clearly there must have been some kind of the change to the software around that time that caused the wifi to stop working.

If your wifi has stopped working completely, then it indicates an issue at your end (which may be compounded by a software issue of course), because for everyone else, Wifi generally works (although I do see dropped packets).

BTW my rpi3 is brand new, made in the UK.

I have been fighting this as well for a few months. Sometimes it lasts for minutes. Sometimes weeks. The common denominator when I lose a connection is I see the calls to reset the CRDA world regulatory domain immediately before it loses connection. Every single time. Ubiquiti AC access point, channel 11, channel width HT40 (only thing that might be special).

Jun 28 14:19:31 raspberrypi kernel: [ 980.387378] cfg80211: World regulatory domain updated:
Jun 28 14:19:31 raspberrypi kernel: [ 980.387387] cfg80211: DFS Master region: unset
Jun 28 14:19:31 raspberrypi kernel: [ 980.387396] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387411] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387426] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz, 92000 KHz AUTO), (N/A, 2000 mBm), (N/A)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387439] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387453] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387468] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387481] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387493] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
Jun 28 14:19:31 raspberrypi kernel: [ 980.387505] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262521] cfg80211: Regulatory domain changed to country: US
Jun 28 14:19:32 raspberrypi kernel: [ 981.262536] cfg80211: DFS Master region: FCC
Jun 28 14:19:32 raspberrypi kernel: [ 981.262540] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262549] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 3000 mBm), (N/A)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262557] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2300 mBm), (N/A)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262565] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2300 mBm), (0 s)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262571] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2300 mBm), (0 s)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262578] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 3000 mBm), (N/A)
Jun 28 14:19:32 raspberrypi kernel: [ 981.262584] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)

Sorry to throw fuel on the fire, but having a similar problem on the Pi Zero W too, I think.

When switching wlan0 between access point mode (when using hostapd) and normal connection mode (i.e. connecting to a router) wlan0 will sometimes lose the ability to associate with an access point.

It will get stuck in this state:

~iwconfig wlan0 
wlan0     IEEE 802.11  ESSID:off/any
          Mode:Managed  Access Point: Not-Associated   Tx-Power=31 dBm

and nothing short of a reboot will fix it. I notice in dmesg the following errors when this happens:

[Wed Jul  5 16:08:27 2017] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[Wed Jul  5 16:09:07 2017] brcmfmac: brcmf_cfg80211_stop_ap: setting INFRA mode failed -7
[Wed Jul  5 16:10:16 2017] brcmfmac: brcmf_cfg80211_stop_ap: setting INFRA mode failed -7
[Wed Jul  5 16:10:18 2017] brcmfmac: brcmf_vif_set_mgmt_ie: vndr ie set error : -30
[Wed Jul  5 16:10:18 2017] brcmfmac: brcmf_cfg80211_scan: scan error (-30)
[Wed Jul  5 16:10:37 2017] brcmfmac: brcmf_vif_set_mgmt_ie: vndr ie set error : -30
[Wed Jul  5 16:10:37 2017] brcmfmac: brcmf_cfg80211_scan: scan error (-30)

What worries me is that it's completely arbitrary and random. I can sometimes switch between the two modes for quite a while before the problem happens. But it eventually does.

FWIW I think reloading the wifi kernel module (by doing "modprobe -r -v brcmfmac && modprobe brcmfmac") fixed it so I'll just have to create a script that does this whenever my Pi has wifi issues.
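
A minimal sketch of such a script could look like the following. It assumes the stuck state always shows "Access Point: Not-Associated" in iwconfig, as in the output above. The function names wifi_is_associated and reload_brcmfmac are hypothetical, and the reload needs root.

```shell
#!/bin/bash
# wifi_is_associated: reads iwconfig output on stdin and succeeds
# (exit 0) unless it reports "Access Point: Not-Associated".
# Note that empty input also counts as "associated" in this sketch.
wifi_is_associated() {
    ! grep -q 'Access Point: Not-Associated'
}

# reload_brcmfmac: the module reload described in the comment above.
# Must be run as root.
reload_brcmfmac() {
    modprobe -r -v brcmfmac && modprobe brcmfmac
}
```

It could be run periodically as `iwconfig wlan0 | wifi_is_associated || reload_brcmfmac`. Keeping the check separate from the reload makes it easy to test the parsing against saved iwconfig output without touching the driver.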

This whole thing is strange. I had these types of issues on Raspberry Pi Zero & Zero W, but they went away completely when I switched channels (as discussed earlier in this thread).

Also, lately I've been using the DietPi OS and haven't had any problems at all. You may want to try that.

I'd really like to look into the problem, having seen it before, but I just can't get it to happen these days! :(

/raj

The more people who can look into this the better; I'm limited in the time
I can spend on this at the moment due to other projects. One major problem
is the lack of a solid mechanism to replicate it.

@timdonovanuk It would be nice if you could share your script with us; I am looking for a workaround. Maybe a monitoring script running as a systemd service... What do you think?

Is there a way for me to trigger the regulatory domain update manually? Like I said, it seems to be pretty consistent for me whenever that runs, the connection drops. I'd be interested to run it manually a few times to see if I can reproduce reliably for you.

@rajid, by chance are you running at channel width 40? And do you remember if you were seeing similar world regulatory updates prior to drops? Curious if maybe there's a combination there around channel 11 and the extra wide channel width... and what kind of router / AP are you using? Just looking for any commonality, since I am seeing this on channel 11 as well, as were others... My AP is a Ubiquiti.

The workaround of switching from automatic channel selection on my Apple Extreme to channel 6 didn't work for me. I will use LAN during vacation.

Edit: Now I lose connection even with LAN, so there must be something more here. Is it a heat problem from using the official case (no fan)?

Hello,
I am facing a very similar issue on a Raspberry Pi Zero W.

I have developed an API running on Node.js on the Pi and integrated with GPIOs.
The Pi is connected to my LAN through Wifi. Everything works great when PC clients call the API. However, as soon as I query my API with an Android device, the Pi crashes. Those crashes happen at random: sometimes the API can be called by Android devices multiple times before the crash suddenly happens.
What I mean by crash is that pings are lost, but the Pi is still up and running.

Calling the same API from PCs never triggers any crash.

I tried to change Wifi channel but did not get any better results.

If I can run anything to help for diagnostic/solution, feel free to ask!

Anything in this forum post help?

https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=188043#p1185246

@matthiasbou

What you said is interesting: my Broadcom driver returns error -110 (sometimes another error) and crashes at exactly the moment I connect my Motorola X2 (Android) smartphone. The same error also occurs when connecting my smart TV. Anyway, I can confirm that the crash occurs at the time the connection is made.

My country is set correctly, IPv6 is disabled and roamoff=1, and I am using channel 6, but the problem still happens. Wifi power saving mode and Bluetooth are disabled by default in my distro.

@JamesH65: I tried the interesting solution of setting the correct country (it was not set correctly before), disabling IPv6 and roaming, but I still have the same problem :(

Wifi connects, but as soon as I start to "play" with an Android device making some API calls on the Pi Zero W, after a while, it crashes.

Why should disabling IPv6 fix a Wifi problem? Is there a sane, reproducible explanation of why IPv6 is involved? The only thing I can think of is that IPv6 may have some slight additional multicast load due to RAs.

For what it is worth, I'm running two Pi Zero Ws as IPv6 bridges between integrated wlan0 and external eth0, with IPv4 blocked. wlan0 is in AP mode and has the ISC DHCPv4 server running. I'm connecting various Android tablets and smartphones to it. I have not noticed any problems so far, but maybe I need to let them run for longer periods of time. I'm using channel 6.

Sorry, I'm using an Apple Airport box, and there's no setting or mention of "channel width". I simply set channel 6 for the 2.4GHz net. I am now using DietPi on my little RaspPi Zero W systems. The other RaspPi's I have were set up a long time back with Edimax USB and have never had any problems. I believe the only time I saw problems was with Raspbian on the Zero W system. I'll have to load that up again and see if I can reproduce it.

/raj

I just performed more tests and changed the router that the Pi connects to.
So far, everything works when the Pi is on this other Wifi router (no change on the Android device side):
Working router configuration:
Channel 6
WPA-PSK
Bandwidth 20 MHz

Non-working router configuration (Wifi loss after some access from the Android device over Wifi):
Netgear WNR1000v2
Channel 6
WPA2-PSK [AES]
Fragmentation Length 2346
CTS/RTS Threshold 2347

I'll switch the working router to WPA2-PSK to see if the issue can then be reproduced.

@TheDiveO With regard to IPv6, the driver has different code paths for IPv6, as does the kernel. There could be a bug in either of those paths that is not in the IPv4 path, or, as ISTR from a bug a while back, something runs an IPv6 code path when it ought to run an IPv4 code path or vice versa. The whole stack is fairly convoluted.

New behavior. After changing locale and running apt-get update and upgrade, my Pi 3 behaves as follows when connected via WiFi:

Devices outside of the local LAN can now connect to the Pi via TCP/IP.

The Pi still refuses all TCP/IP connections from the local LAN only.

The Pi can still access the outside internet via WiFi.

never mind. Nothing has changed. This is the exact same behavior as before. Pi3 wifi drops all packets on local LAN.

Just to follow up a little... I started up a new AP (Linksys E4200 V2), which I had lying around. I set it up on channel 11 for 2.4GHz, configured WPA2 Personal, an SSID and password. Then configured this on my Raspberry Pi Zero W. It connected just fine. I then moved this AP to the same room where my normal house AP is located (which is on channel 6). My RaspPi then got ASSOC-REJECT status_code=16. Moving the AP back into my office once again made the RaspPi associate just fine.

So, it seems that in my case, at least, channel 11 is a problem if AP is in the other room. I'm guessing this probably indicates an interference problem.

I'll also post here a web page I found which tells what all of the status_codes and failure codes are:

https://supportforums.cisco.com/document/141136/80211-association-status-80211-deauth-reason-codes

This shows my "status_code=16" to be caused by a timeout, so one of the systems is simply not receiving packets in a timely fashion.

I just thought I'd throw this information out there in case it helps anyone.

When I turn on the lights in my kitchen it kills my wifi connection in the
living room ... I dunno why, but when you talked about interference, I think
I'm not crazy

There's a really nice WiFi Analyzer program for Android, which shows what APs are around you, along with their detailed information. (I wish something like that existed for iPhone/iPad, but Apple...)

@JamesH65 you are really making me uneasy saying that a data link layer driver (layer 2) does mess around with the network layer 3. "Mess" isn't probably an appropriate word for this situation either...

I'm not actually saying that. I'm not an expert on the Linux network
stack, but I certainly seem to remember seeing some IPv6 specific stuff in
a driver somewhere.

The stuff is all in the kernel tree, you are welcome to take a look
yourself to put your mind at rest.

--
James Hughes
Principal Software Engineer,
Raspberry Pi (Trading) Ltd

@TheDiveO James is referring to things like hardware checksum offload.
SMSC95xx for example can only support IPv4 checksum offload, because IPv6 requires a computed checksum of 0x0000 to be replaced with 0xFFFF. See https://github.com/torvalds/linux/commit/fe0cd8ca1b82983db24b173bb8518ea646c02d25. Hence IPv6 and IPv4 will be following different code paths. Nothing dubious there, but inherent in the network stack where the hardware can't cover all situations.

I'm pretty sure this bug is in the Broadcom driver, not the kernel.

Almost certainly. The Brcm driver is a big chunk of code though, and bugs
like this are not easy to debug. Especially when you cannot replicate them...

The more I struggle with this, the more I'm starting to wonder if this is related to Ubuntu/Debian's inability to connect wlan0 and eth0 to the same subnet without some extensive configuration. I'm going to look into this more and see if this is the issue.

@JamesH65 would it help if I (or someone else) set up a Zero W or RPi 3 for you in an environment where this is easily reproducible and gave you ssh access there to debug? (I would need to buy an extra Zero W for this.)

Probably not but thanks for the offer. I tend to run custom changes to the
driver and kernel, with changes made multiple times a day. Doing that
remotely isn't feasible. Mechanisms for reliably reproducing the issue are
really what is needed.

James,
If I do this simple polling loop, I quickly see the onboard wifi network degrade to an unusable state. When I turn off the onboard wifi and use a USB wifi, it works fine. I don't recall if it is sensitive to the BT device being present or absent, and can't easily check that for a few days. I limited it to 10 minutes so I could get back into the Pi Zero W after the experiment.

bash# ((t=date +%s+600)); while [date +%s-lt $t ] ; do hcitool name <BTMAC>; done
Hope that helps,
Benjamin

That code snip lost my back ticks. Escaping them...

((t=` date +%s`+600)); while [ `date +%s` -lt $t ] ; do hcitool name "insert BT MAC"; done
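For anyone who wants to repeat this kind of time-boxed stress test without fighting shell quoting, here is a minimal Python sketch of the same idea: run a command over and over until a deadline passes. The helper name `run_until_deadline` is made up for illustration, and the MAC address in the example comment is a placeholder, not a value from this thread.

```python
import subprocess
import time


def run_until_deadline(cmd, seconds):
    """Run cmd repeatedly until `seconds` have elapsed; return the run count."""
    deadline = time.monotonic() + seconds
    runs = 0
    while time.monotonic() < deadline:
        # Output is discarded; the point is only to generate repeated activity.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        runs += 1
    return runs


# Example (placeholder MAC, 10 minutes as in the loop above):
# run_until_deadline(["hcitool", "name", "AA:BB:CC:DD:EE:FF"], 600)
```

Using a monotonic clock rather than wall-clock time keeps the limit stable even if NTP adjusts the system time while the loop is running.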

OMG. I think it's fixed. Wifi is up when ethernet is unplugged. Unbelievable.

I removed all mention of eth0 from my /etc/network/interfaces file, replaced allow-hotplug with auto, and then forced wireless-power off on both wlan0 and wlan1.

my /etc/network/interfaces file:

auto lo
iface lo inet loopback

wireless-power off
auto wlan0
iface wlan0 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

wireless-power off
auto wlan1
iface wlan1 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

Then I flushed arp:

ip -s -s neigh flush all

Then I rebooted:

sudo reboot now

And now my wifi works. Unbelievable. Thanks to all who commented on this thread.

Your particular configuration issue might be resolved, but the bug in the Broadcom driver is still present.

OK, we've been looking at this. My first issue was that when SSH'ing in to my test device the session was locking up UNLESS the ethernet cable was also inserted. Turns out that ARP is handled by either interface, so when the ethernet was connected it was using that. Having it not connected meant it was being handled by the Wifi and encountering a problem. This problem could be fixed by turning off QoS/ToS in SSH (see here https://expresshosting.net/ssh-hanging-authentication/), which in turn implies that the Broadcom Wifi driver is very unhappy with the TOS (type of service)/DSCP field being set. This has been seen before in NTP (Issue #1519). I suspect that this could be a cause of the Wifi problems reported in this issue and am going to have a dig through the Brcm driver today to see if I can find anything.

Interim report. We are definitely seeing issues with certain TOS packet values causing packets to be silently dropped, causing SSH lockups. Nothing obvious yet in the impenetrable driver code, which TBH shouldn't be touching this part of the packet anyway, but there is clearly something going on. Does this have anything to do with the general wlan freezes reported here? Don't know yet.

I have similar issues on a Pi Zero W with raspbian jessie and kernel 4.9.35+
I have the same issue mentioned by JamesH65 with SSH and ntpd (TOS). Fix from https://expresshosting.net/ssh-hanging-authentication/ worked for sshd. I also have the wlan0 disconnection issues, but with somewhat less verbose log messages. It superficially just looks like the carrier is lost, and wpa_supplicant sometimes fails to renegotiate. The only way out of that is to issue ifdown wlan0, wait, ifup wlan0 for me, then wlan0 starts working again. Happy to supply logs if anyone requires them. Just tell me which.

Interim report. Wanted to get some notes down before they get forgotten. We have determined that it is the response from the wirelessly connected Pi that is going missing when accessing via SSH from another device. If that response has the TOS field set then the packet is silently dropped - never gets back to the requester. We can replicate this using netcat. A simple netcat command from the wireless Pi with the TOS flags set does not seem to make it out of the device.
So on the wireless Pi, try and send a UDP packet to another device...
nc -T 0x10 -u 7
The device does not seem to receive the packet (as shown by running tcpdump on the destination)
nc -T 0x00 -u 7
will get to remote system.
We have only tried this over the wireless network here in the office. I need to set up another Wifi network to see if it is router related, or an issue in the driver.

Minor correction to the above netcat command
nc -T 0x10 -u <dest_ip> 7
UDP port 7 was chosen as it is the echo service. It doesn't matter that this isn't running on the remote machine, although that does lead to the appropriate ICMP unreachable response which is a useful tell-tale that the remote end got the message.
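If netcat on your system lacks `-T`, the same probe can be approximated from Python by setting the `IP_TOS` socket option before sending. This is only a sketch under assumptions: the function names are invented for illustration, the payload is arbitrary, and the address in the example comment is a documentation placeholder rather than anything from this thread.

```python
import socket


def open_udp_socket_with_tos(tos):
    """Return a UDP socket whose outgoing IPv4 packets carry the given TOS byte."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock


def send_probe(dest_ip, tos, port=7, payload=b"tos-probe\n"):
    """Send one UDP datagram to dest_ip:port with the TOS byte set."""
    with open_udp_socket_with_tos(tos) as sock:
        return sock.sendto(payload, (dest_ip, port))


# Example: send_probe("192.0.2.10", 0x10)  # placeholder destination address
```

As with the netcat test, run tcpdump on the destination to see whether the datagram actually arrives.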

Beginning to think the SSH/ToS issue is actually unrelated. I've traced packets down to the HW level and it doesn't matter whether the TOS flags are set or not, the packets do seem to make it down to the firmware (or at least the brcmf_sdiod_send_pkt function which is past any priority handling in the linux driver). Which indicates the issue is either in the firmware in the chip (closed source), or actually router related - i.e. the wireless router I am using doesn't let through TOS flags that are non-zero (or perhaps > 0x04). I will try a different wireless router tomorrow to try and confirm this.

Is there any chance of locating the department responsible for developing the brcmfmac module so that someone can follow that thread or at least respond if any fix for these bugs will be released?

We are already in contact via the linux-wireless mailing list.

We are already in contact via the linux-wireless mailing list.

... and more direct routes. The issue has always been one of reproducibility - once we have a way of clearly demonstrating the problem we can present that to Broadcom/Cypress and get it fixed. I was never able to see the problem using NTP, but James has had a successful failure with SSH so I'm optimistic we'll get to the root cause.

@pelwell +1 for the term "_successful failure"_ :)

I have a hacky fix for the SSH locking up issue. It does appear to be an issue in the firmware. Here are some details.

We have been investigating an issue on the Raspberry Pi, where SSH and
NTP sessions are failing when the TOS flag is set in the IPv4 header.

Here's an extract on what TOS is:

TOS is 0x08 or 0x10. Only one of the 4 bits is allowed to be set at a time.
0x10 - minimise delay
0x08 - maximise throughput
0x04 - maximise reliability
0x02 - minimise monetary cost.
Technically TOS has been superseded by DSCP, but is still supported.

We could try recreating this issue with DSCP if really required, but it doesn't
appear to be relevant.

Details on the SSH issue, and a workaround can be found here https://expresshosting.net/ssh-hanging-authentication/

However, this is clearly an issue somewhere in the communications
stack, so this is what we have been investigating.

We have been able to replicate this with a simple example using netcat. Firstly,
connect a Pi wirelessly to an AP (PiA), with another device connected
either wirelessly or via ethernet to the same network (PiB).

On PiB run

sudo tcpdump -n 'udp port 7' -v -i wlan0 <<<< or eth0 depending on connection

On PiA,

nc -T 0x10 -u <dest_ip> 7

This sends a UDP packet to port 7, with the TOS flag set to 0x10.

This will NOT arrive (or will sometimes be very badly delayed - 10's of seconds)

Sending TOS as 0

nc -T 0x0 -u <dest_ip> 7

WILL arrive. 0x02 and 0x04 will also arrive, 0x8 and 0x10 will not.

Instrumenting the brcmfmac driver shows that the packet with the TOS
flag = 0x10 is correctly sent down the stack to the HW, but then the
packet goes missing.

We have been able to get the packet through by hacking the BCDC code,
in the bcdc.c!brcmf_proto_bcdc_hdrpush function, the priority of the
packet is also pushed in to the bcdc header. By setting this to a
constant value (which could be anything from 0-7), the packet is
transmitted. So it seems that a constant value for bcdc priority
works, but having it set to the priority as determined by the incoming
skb priority things fail IF the TOS is 0x08 or 0x10. So, it seems to
be a combination of packets with varying priorities that causes
higher priority values to fail, not the value itself.

Since the BCDC header priority is destined for the firmware, this
would appear to be a problem in the firmware itself, not the Linux
driver.

Here is a diff of the change that appears to stop the issue happening.

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcdc.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcdc.c
index 9f2d0b0cf6e5..2e6132a513be 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcdc.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcdc.c
@@ -274,7 +274,7 @@ brcmf_proto_bcdc_hdrpush(struct brcmf_pub *drvr, int ifidx, u8 offset,
 	if (pktbuf->ip_summed == CHECKSUM_PARTIAL)
 		h->flags |= BCDC_FLAG_SUM_NEEDED;
 
-	h->priority = (pktbuf->priority & BCDC_PRIORITY_MASK);
+	h->priority = 0;
 	h->flags2 = 0;
 	h->data_offset = offset;
 	BCDC_SET_IF_IDX(h, ifidx);

@JamesH65 Great. Since I don't expect a firmware fix soon, could you please copy this to linux-wireless?

I'm going to wait for some information from Broadcom/Cypress, since I am
not sure this hack is safe in all circumstances. I have emailed them. Once
I get some feedback I'll send a patch to linux-wireless.

Some test results seem to indicate no ill effects from this hack. Just been transferring data back and forth, 500MB to the Wireless Pi, 3.4GB sent out. RX packets 56 dropped from 794730, no TX packets dropped from 2813930. Performance seems spot on for an 11Mbit/s connection. So looking acceptable, but this hack actually disables something that should probably be enabled, so not a long term solution.

@lategoodbye Been pondering about pushing this to linux-wireless. Since this hack is really only relevant/tested to the particular chip on the Pi (BCM43438?), and the driver code is for multiple chip models, the patch would need to determine the chip type being used before making the change, and I suspect that linux-wireless would be unhappy with that sort of change and I would be unable to test it anyway. I'm definitely going to PR it on our repo (unless a firmware fix is coming, I doubt that in a sensible schedule). Just not sure how to push it on to linux-wireless, if at all.

@moonman
Do you think, this could be pushed to ARCH linux-raspberrypi?

@JamesH65 Sure, your hack isn't suitable for all chip models. But it's not your job to find a solution for all of them. I think a simple copy of your long comment above (including the hack) would be sufficient. My intention was to inform other non-Broadcom kernel developers about the issue. I didn't expect you to send a proper patch for this issue, only a bug report.

I suggest we get it into our repo to get some serious testing done - start with rpi-4.12.y which is used by nightly cutting-edge LibreElec builds.

One thought - could you make the patch more selective in its priority filtering and still fix the problem?

I'm just preparing a PR for this to go to the Pi repo.

With regard to selective checking, I did try with simply detecting priority
6 (the one that is passed down the stack - it's translated from the TOS
value to something more Linux stack specific), and setting that to 0 and
that did seem to work, but my suspicion is that it is a combination of
different priorities rather than specifically 6 that causes the problem. We
also know that a TOS of 0x08 also has problems, and that is, IIRC,
converted to 2 by the time it gets to this point. We could simply say, if
it's 6 or 2, then set it to zero, but I am still not sure that would catch
everything that may cause issues. Since the value is 0-7 anyway, I reckon,
for this hack, it's better to just set to 0 in all cases. We know that
works, it may not be optimal of course, but I think that all packets would
get through. Note that this setting does NOT affect the TOS value in the
IPv4 packet - that remains the same, it's just this system of sending the
priority to the chip and how it then handles it that appears flaky.

I have had some contact with Cypress, who are going to try and get this
looked at asap.

We also know that a TOS of 0x08 also has problems, and that is, IIRC, converted to 2 by the time it gets to this point.

Correct. TOS 0x08 (maximise throughput) mapped to 2. They are TC_PRIO_xxx values from http://elixir.free-electrons.com/linux/latest/source/include/uapi/linux/pkt_sched.h#L19. 6=INTERACTIVE, 2=BULK.

Previous testing with either setting IPQoS to 8 in sshd_config, or with netcat using TOS 8 resulted in dropped packets.
Neither 0x02 nor 0x04 caused any issues, but there's little a wifi driver can do over cost difference (there is none) or reliability so is probably ignoring them.
edit Actually the mapping table at http://elixir.free-electrons.com/linux/latest/source/net/ipv4/route.c#L177 taking tos>>1 is setting TOS 0x02 and 0x04 to TC_PRIO_BESTEFFORT = 0 anyway, which explains why they don't have any issues.
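To make the mapping above easier to check by hand, here is a small Python model of it. It is a sketch, not kernel code: the table is my transcription of `ip_tos2prio` from the route.c link above and the `TC_PRIO_*` constants from pkt_sched.h, and the 0x1E TOS mask and the 0x7 BCDC priority mask are assumptions based on the kernel headers and on the "0-7" range mentioned earlier in the thread. Verify against the linked sources before relying on it.

```python
# TC_PRIO_* values as listed in include/uapi/linux/pkt_sched.h
TC_PRIO_BESTEFFORT = 0
TC_PRIO_BULK = 2
TC_PRIO_INTERACTIVE_BULK = 4
TC_PRIO_INTERACTIVE = 6

IPTOS_TOS_MASK = 0x1E      # assumed: the four TOS bits of the IPv4 TOS byte
BCDC_PRIORITY_MASK = 0x7   # assumed: BCDC header priority is a 3-bit field

# Transcription of the kernel's 16-entry lookup table, indexed by tos >> 1.
IP_TOS2PRIO = (
    [TC_PRIO_BESTEFFORT] * 4
    + [TC_PRIO_BULK] * 4
    + [TC_PRIO_INTERACTIVE] * 4
    + [TC_PRIO_INTERACTIVE_BULK] * 4
)


def tos_to_priority(tos):
    """Map an IPv4 TOS byte to the skb priority the stack would assign."""
    return IP_TOS2PRIO[(tos & IPTOS_TOS_MASK) >> 1]


def bcdc_priority(skb_priority):
    """Priority the unpatched driver would place in the BCDC header."""
    return skb_priority & BCDC_PRIORITY_MASK


for tos in (0x00, 0x02, 0x04, 0x08, 0x10):
    print(hex(tos), "->", bcdc_priority(tos_to_priority(tos)))
```

Under these assumptions only 0x08 and 0x10 yield a non-zero priority, which lines up with the two TOS values that were seen to lose packets.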

Just a quick report. Cypress have been able to replicate the issue, and are
checking the firmware so looking hopeful. Very pleasing and rapid response
from the guys there.

Easier way of reproducing - use ping! (I'd forgotten that ping/ICMP was above IP - silly me)

ping -Q 0x10 <dest ip addr> on the Pi3
and run tcpdump -n -v -i wlan0 'icmp' on the destination.
Results in >90% packet loss on -Q 0x10 or -Q 0x08. It's often started OK with 4 sequential packets getting through, but then goes very intermittent.
It's slightly more useful than netcat as (a) it keeps on repeating, and (b) it tells you when it gets a response.
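If you want to compare several TOS values in one go, the ping test can be wrapped in a short script. The following is a rough Python sketch, assuming an iputils-style `ping` whose summary line contains "N% packet loss"; the function names and the placeholder address are mine, not from this thread.

```python
import re
import subprocess

LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_packet_loss(ping_output):
    """Extract the packet-loss percentage from ping's summary line, or None."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def loss_for_tos(dest_ip, tos, count=20):
    """Run `ping -Q <tos>` against dest_ip and return the reported loss in percent."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-Q", hex(tos), dest_ip],
        capture_output=True,
        text=True,
    )
    return parse_packet_loss(result.stdout)


# Example sweep over the TOS values discussed in this thread
# (192.0.2.10 is a placeholder destination):
# for tos in (0x00, 0x02, 0x04, 0x08, 0x10):
#     print(hex(tos), loss_for_tos("192.0.2.10", tos))
```

Run it on the wirelessly connected Pi; a large gap between the 0x00 result and the 0x08/0x10 results would match the behaviour described above.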

There is a workaround here: https://github.com/raspberrypi/linux/pull/2126
If you want to test it with 4.9 kernel, use rpi-update.
Then replace:
modules/4.9.39+/kernel/drivers/net/wireless/broadcom/brcm80211/brcmfmac/brcmfmac.ko with this
modules/4.9.39-v7+/kernel/drivers/net/wireless/broadcom/brcm80211/brcmfmac/brcmfmac.ko with this

EDIT: Latest rpi-update kernel now includes the patch, so downloading the modules is no longer required.

Not sure if related. Connection on the on-board broadcom on my Pi Zero W drops every 2 hours when a second interface wlan1 is up with a rt8192eu/8192eu dongle. It does not appear to be a power issue because it is very cyclical, I have a pastebin of the disconnects at https://pastebin.com/5hMQHWeW

When this is ongoing, wpa_supplicant sometimes gives up trying for no obvious reason other than failure to authenticate and the only way to get connectivity back on wlan0 is to issue ifdown/ifup which then works 100%.

Now I don't know if this is the related Broadcom kernel module problems causing issues, or if it's the buggy 8192eu or both. Happy to supply more lines of logs if needed or post in a different thread, but someone on #raspbian suggested I add this here.

If you can confirm that vcgencmd get_throttled returns 0x0 after a disconnection that will rule out a power problem.

Usually happens when I'm asleep/not with the Pi and I find out in retrospect when I can't connect to it anymore (then I used to connect through the second AP and reset wlan0). However, since the 8192eu dongle is unplugged now, I haven't had an event. I can plug in the second dongle with the buggy module, but how soon after the disconnect do I need to check vcgencmd get_throttled?

As long as you haven't rebooted, the upper bits will tell you if there has ever been an under-voltage event.

Just ran it. Definitely not rebooted since last disconnect. Can confirm vcgencmd get_throttled returns:
throttled=0x0

Unfortunately get_throttled won't work on a Pi0/Pi0w (doesn't have the under-voltage detection circuit).
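For boards that do have the detection circuit, the value can be decoded with a few lines of Python. This is a sketch based on my understanding of the bit layout (low bits are the current state, bits 16 and up are the "has occurred since boot" flags mentioned above); the flag descriptions and the function name are mine, so treat the table as an assumption and check it against the firmware documentation.

```python
# Assumed bit meanings for the value printed by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
}


def decode_throttled(line):
    """Decode a line such as 'throttled=0x50005' into a list of set flags."""
    value = int(line.strip().split("=", 1)[1], 16)
    return [
        description
        for bit, description in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]


print(decode_throttled("throttled=0x0"))
print(decode_throttled("throttled=0x50005"))
```

An empty list for `throttled=0x0` means none of the flags in this table are set.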

For some reason copy&pasting the diff from JamesH65 didn't work for me. Made a patchfile that should apply right away, figured people might find this useful: https://github.com/bortek/EZ-WifiBroadcast/blob/master/kernel/linux-4.9.28-brcmfmac-tos.patch

Filename says 4.9.28, but should apply at least up to 4.9.35 (and probably later ones also).

Copy this file to the kernel tree root directory and apply with patch -p1 < linux-4.9.28-brcmfmac-tos.patch

Additional (but odd) information:

If the Pi Zero W is connected to wlan0, but otherwise doing nothing (cron script checking sntp every 15 minutes at most), there are very frequent disconnects, something in the order of 1-10/hour lasting at most a second each.

If I had something using the connection, though, for example idling on IRC (multiple large channels), the connection does not drop a single time throughout the whole time this is the case.

Turns out loading the 4.9.39 kernel modules on 4.9.35 was not a good idea.

Another bug report from the forum, mailbox error seems common.

https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=189046

Latest rpi-update kernel now includes the BCDC priority patch.

Cypress (was Broadcom) have given us new releases of the WiFi and Bluetooth firmwares to test. You can download a pre-release here. After downloading to your Pi, run:

tar zxvf brcmfw_170808.tgz
cd brcmfw_170808
./brcmfw -i

This will extract then install the new firmware (the old versions will be backed up first).

To revert to the original firmware (which I recommend you do before installing a proper release):

./brcmfw -u

What’s changed:

  1. CVE-2017-9417: “Broadpwn” issue fix
  2. Add “CY” string in the version string.
  3. AMPDU sequence number deadlock fix (potential fix for this issue)
  4. CLM version upgrade
  5. CVE-2017-0572: memory corruption fix

Just a side note - I disabled internal wifi on my first Pi Zero W and switched to an USB wifi dongle, all problems gone. Some days ago I installed another Pi Zero W to control my 3D printer (using OctoPi). I was a bit surprised to see that the internal wifi seems to work flawlessly - but after some tests I can confirm that wifi breaks as soon as I connect from my LG G4 Android phone (Chrome browser). When I think about it, I guess the behaviour on the first Pi was quite similar...
Connection from my PC does not lead to such effects.

Please try the new firmware and report back with your findings.

I installed the preview firmware. I still get the "raspberrypi kernel: brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content:" error, followed by wifi failure.

What is your use case?

same as:
https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=189046

trying the working config posted there. I will update.

Please provide your kernel version, a summary of the connected devices and how long it takes until the error appears.

The mailbox error is still under investigation, I'm not expecting this
firmware to fix it. There is more debugging in this firmware to help track
it down though. If you enable the driver debugging (sorry, on mobile and
don't have details on how to do that) and see the error, then dumping the
debug and posting details here when you get the mailbox error would be
useful.

The debugging is disabled by default and needs a module rebuild to enable (perhaps we should change that during these investigations). The changes required are the addition of BRCMDBG=y to .config followed by a rebuild, then add brcmfmac.debug=0x???????? to /boot/cmdline.txt, where ???????? is a hexadecimal number comprising the bit values documented here: https://github.com/raspberrypi/linux/blob/rpi-4.9.y/drivers/net/wireless/broadcom/brcm80211/brcmfmac/debug.h#L22
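For anyone unsure how to turn the individual bit values into the single hexadecimal number, here is a minimal sketch. It only uses the two values that are quoted in this thread (0x00000002 for BRCMF_TRACE_VAL and the 0x100000 value requested further down); the dictionary labels and function names are made up for this illustration, and the full list of bits should be taken from debug.h, not from here.

```python
# Bit values for the brcmfmac.debug mask. Only values quoted in this thread
# are listed; the labels on the left are hypothetical names for this sketch.
DEBUG_BITS = {
    "trace": 0x00000002,       # BRCMF_TRACE_VAL, as quoted in this thread
    "fw_console": 0x00100000,  # value requested in this thread; label is an assumption
}


def debug_mask(*names):
    """OR together the bit values for the given labels."""
    mask = 0
    for name in names:
        mask |= DEBUG_BITS[name]
    return mask


def cmdline_option(mask):
    """Format the mask the way it would be appended to /boot/cmdline.txt."""
    return "brcmfmac.debug=0x{:08x}".format(mask)


print(cmdline_option(debug_mask("trace")))  # prints brcmfmac.debug=0x00000002
```

Combining several labels, e.g. `debug_mask("trace", "fw_console")`, simply sets both bits in the same number.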

Tried the test firmware posted by pelwell; the issue still persists. The connection freezes every 1 - 2 hours. When the connection has dropped and I ping (ping 8.8.8.8), it works again _briefly_ until the 8th ping. The ping behavior is consistent across freezes: working -> freeze -> ping 8.8.8.8 -> working -> 8th ping -> freeze. After that, I need to reboot my Raspberry Pi. Don't know if this helps, though.

Kernel:
Linux raspberrypi 4.9.41-v7+ #1023 SMP Tue Aug 8 16:00:15 BST 2017 armv7l GNU/Linux

Firmware:
BT: test_170808
WiFi bin: test_170808
WiFi txt: test_170808

Anything relevant in dmesg when it happens?

Nope, nothing interesting. Maybe that's because I haven't rebuilt the module with debugging support. How do I do that? Or will you provide the compiled module? Thanks.

Attached below are the dmesg logs:

pi@raspberrypi:~ $ sudo dmesg

[    4.654722] brcmfmac: Firmware version = wl0: Aug  7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
[    5.752968] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[    5.753285] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[    6.206530] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[    6.206577] brcmfmac: power management disabled
[    7.088933] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[    7.340040] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    7.340841] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[    7.431235] Adding 102396k swap on /var/swap.  Priority:-1 extents:4 across:217088k SSFS
[   10.182342] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   10.182357] brcmfmac: power management disabled
[   10.872838] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   10.872903] brcmfmac: power management disabled
[   11.594201] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   14.128592] ip_tables: (C) 2000-2006 Netfilter Core Team
[   14.172268] nf_conntrack version 0.5.0 (15360 buckets, 61440 max)
[   54.604680] random: crng init done

pi@raspberrypi:~ $ sudo dmesg -l err
[    4.501055] raspberrypi-touchscreen 3f700000.dsi.0: Unknown Atmel firmware revision: 0xfa

See Phil's post above for details on the debug module. We are particularly
interested in the debug trace when the mailbox error occurs.

Latest rpi-update kernel enables BRCMDBG which should allow the brcmfmac.debug=0x???????? command line option suggested by @pelwell earlier.

Errrr..... my Pi3, which was rock solid with wifi, now also loses it since I upgraded to the latest Raspbian a few days ago :-(

What are the symptoms? I would not expect a regression in the firmware, or
indeed the driver itself.

--
James Hughes
Principal Software Engineer,
Raspberry Pi (Trading) Ltd

@Crrispy
Try this:
I removed all mention of eth0 from my /etc/network/interfaces file, replaced allow-hotplug with auto, and then forced wireless-power off on both wlan0 and wlan1.

my /etc/network/interfaces file:

auto lo
iface lo inet loopback

wireless-power off
auto wlan0
iface wlan0 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

wireless-power off
auto wlan1
iface wlan1 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

Then I flushed arp:

ip -s -s neigh flush all

Then I rebooted:

sudo reboot now

I'm pretty sure that I'm encountering this bug on a regular basis. Running hostapd, using the internal Broadcom wifi to host an access point, and routing clients that connect to it through a USB wifi dongle that serves as a wireless client. Multiple devices are connected, but as soon as I carry the Pi out of range of the connected devices, the wlan crash seems to occur. As with others, it's only the internal broadcom wlan device that goes down: ethernet and the other wlan remain unaffected. I'm also getting the "mailbox" error in the system log:

Aug 27 08:34:38 raspberrypi kernel: [40063.859420] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

(more log details at https://pastebin.com/NPB00ZEq )

I had noticed that the output of iwconfig no longer shows the Tx_Power value when the wlan device is in its failed state, so I've used that to script an automatic reboot as a workaround in the meantime.
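The script itself was not posted, so the following is only a rough sketch of how such a check might look, assuming the symptom is exactly as described: the Tx-Power field disappears from the iwconfig output when the device has failed. The function names are hypothetical, and what to do on failure (reboot, log, alert) is left to the caller.

```python
import subprocess


def tx_power_missing(iwconfig_output):
    """Return True when the iwconfig text contains no Tx-Power field."""
    # Accept both "Tx-Power" (as iwconfig prints it) and "Tx_Power".
    normalized = iwconfig_output.lower().replace("_", "-")
    return "tx-power" not in normalized


def wlan_looks_failed(interface="wlan0"):
    """Run iwconfig for one interface and apply the Tx-Power heuristic."""
    result = subprocess.run(
        ["iwconfig", interface],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return tx_power_missing(result.stdout)
```

A cron job or systemd timer could call `wlan_looks_failed()` every minute or so and trigger the reboot when it returns True. This is a heuristic based on one report, not a documented failure indicator.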

I've just updated to the latest rpi-update, installed the test wifi drivers referenced above, and added the debug flag to my cmdline.txt, using the hex value for BRCMF_TRACE_VAL: bcrmfmac.debug=0x00000002

If you can regularly get the mailbox error, we would be grateful for the results from the debug driver. When you get the mailbox error, please do something like this to get the forensics and post the results here; I can then pass them on to Cypress, who are investigating the issue.

cat /sys/kernel/debug/brcmfmac/mmc1\:0001\:1/forensics

Well, what was an easily reproduced issue I am no longer able to duplicate, since running rpi-update. I might be able to make it happen by downgrading back to a fresh install of the Raspbian build from 21 June 2017, if that would be helpful.

@JamesH65
Managed to capture the forensics that you had requested (after the mailbox error), but to be clear, this is after downgrading to the kernel included in the 21 June Raspbian build. It might well have already been resolved, because I've yet to duplicate the issue after installing the test firmware posted by @pelwell about two weeks ago, and running rpi-update.

Here's a link to the forensics:
https://pastebin.com/VVqVQ8FW

Hope that's helpful...

So that was with the old firmware, I suspect. We are hoping to get the forensics for the new firmware, which (apparently) has extra messages aimed at tracking down the mailbox issue. This makes me think Cypress still seem to think the mailbox issue will be there, even after the other fixes. I will pass on the data anyway, just in case it helps.

Good to know that errors are much harder to reproduce!

This makes me think Cypress still seem to think the
mailbox issue will be there, even after the other fixes.

Yes, that is my understanding too.

@randyoo Thanks for the positive feedback.

@JamesH65
Okay, it happened again, this time on the latest rpi-update firmware, and using the test firmware posted by @pelwell . Unfortunately, the forensics output looks identical to the one in the previous post. (Not sure why I'm not getting different/more detailed info in the forensics dump, since I do have debugging enabled in my cmdline.txt, as per my previous post)

I did go ahead and dump the contents of the other /sys/kernel/debug stuff, too. Here it is: https://pastebin.com/pdFUPBxN

On the last wlan freeze, the kernel log trace appears to be more detailed. See the link:
https://pastebin.com/KTxbgpYV

Hope that helps.

Right, sorry. Managed to get a capture of the forensics, and yes, there appears to be much more detail there:
https://pastebin.com/qypfAfAp

In case new cases help, I also get this from time to time:

pi@jempi:~ $ grep "brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012" /var/log/syslog
Aug 14 22:16:23 jempi kernel: [ 501.247242] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Aug 17 20:26:20 jempi kernel: [ 509.684277] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Aug 24 23:57:37 jempi kernel: [ 573.652189] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Aug 29 23:50:16 jempi kernel: [ 5052.517999] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Aug 30 00:02:18 jempi kernel: [ 170.978988] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Aug 30 23:58:03 jempi kernel: [ 8254.502431] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Sep 2 00:33:28 jempi kernel: [ 5979.773944] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

I'm using the internal wifi (wlan0) as AP and plugged a dongle (wlan1) to connect to my router:
pi@jempi:~ $ ifconfig wlan0
wlan0 Link encap:Ethernet HWaddr b8:27:eb:cf:db:b8
inet addr:10.3.141.1 Bcast:10.3.141.255 Mask:255.255.255.0
inet6 addr: fe80::6b56:4657:75cd:a501/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:5492 (5.3 KiB)

pi@jempi:~ $ ifconfig wlan1
wlan1 Link encap:Ethernet HWaddr 00:60:b3:db:8a:4a
inet addr:192.168.1.74 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::260:b3ff:fedb:8a4a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1358 errors:0 dropped:2 overruns:0 frame:0
TX packets:789 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:256652 (250.6 KiB) TX bytes:215250 (210.2 KiB)

I had kernel 4.9.35-v7+ and upgraded it yesterday to 4.9.46-v7+ (with rpi-update), but that did not help. Output from syslog when it fails:

Sep 2 00:33:28 jempi kernel: [ 5979.773944] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Sep 2 00:34:00 jempi kernel: [ 6011.624839] brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets
Sep 2 00:34:02 jempi kernel: [ 6014.184823] brcmfmac: send_key_to_dongle: wsec_key error (-110)
Sep 2 00:34:05 jempi kernel: [ 6016.744833] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
Sep 2 00:34:06 jempi kernel: [ 6017.704831] brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets
Sep 2 00:34:08 jempi kernel: [ 6020.264850] brcmfmac: send_key_to_dongle: wsec_key error (-110)
Sep 2 00:34:11 jempi kernel: [ 6022.824903] brcmfmac: brcmf_cfg80211_change_station: Setting SCB (de-)authorize failed, -110

Restarting the wlan0 interface with sudo ifconfig wlan0 down and then up didn't help.
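The grep output and the syslog excerpt above share one line format, so the kernel uptime and the reported code can be pulled out with a short script if anyone wants to compare occurrences across boots. This is just a sketch: the regular expression is written against the lines quoted in this post, and the function name is made up for illustration.

```python
import re

# Matches lines such as:
# "... kernel: [ 501.247242] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012"
MAILBOX_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+brcmfmac: brcmf_sdio_hostmail: "
    r"Unknown mailbox data content: (?P<code>0x[0-9a-fA-F]+)"
)


def mailbox_errors(lines):
    """Return a list of (uptime_seconds, code) for each mailbox error line."""
    hits = []
    for line in lines:
        match = MAILBOX_RE.search(line)
        if match:
            hits.append((float(match.group("uptime")), match.group("code")))
    return hits
```

Passing an open /var/log/syslog file object as `lines` works, since the function only iterates over it. Lines from other brcmfmac messages (such as the wsec_key timeouts) are ignored.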

@bulrog Please also provide the forensics as explained by James above.
Which driver does wlan1 use? Does this issue still occur with the dongle unplugged?

A few more forensics captures:
https://pastebin.com/vqh3UcF3

In case this helps Cypress look in the right area: I've experienced this issue many, many times now, and it seems to manifest itself whenever a device attempts to connect. It's happened many times after walking into range of the AP, or when a sleeping device is awoken.

I've kept this configuration long enough to capture forensics, and if there's any more detail I can provide, I'd be happy to do so, but the wlan crashes are now happening so frequently that it renders my device useless. I intend to use another USB wifi dongle to replace the internal radio, in order to attain reliability.

I've passed your most recent forensics on to Cypress - thanks for taking the time.

Just wanted to chime in. I have the exact same issue on three RPI3s running the newest firmware. I am using Octopi on all three and accessing them via Printoid.

bcrmfmac.debug should be brcmfmac.debug (thanks for spotting @MilhouseVH)
I'll edit the earlier posts.

bcrmfmac.debug should be brcmfmac.debug (thanks for spotting @MilhouseVH)
I'll edit the earlier posts.

Based on this, I assumed that the forensics I captured weren't complete.

I've repeated the forensics capture, it can be perused at the following URL:
https://pastebin.com/ha5rd7SW

In addition, my /var/log/kern.log file is near 200MB in size, most of it consisting of very similar entries. I located the mailbox error at 00:53:19, and snipped a couple seconds before and after the error. Hopefully it helps, see it here:
https://pastebin.com/JcE0zstS
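With a log of that size, cutting out the few seconds around the error by hand is tedious. A possible way to automate it is sketched below; it assumes every relevant line carries a kernel timestamp in square brackets (as in the syslog lines quoted earlier in this thread) and streams through the file so the whole log never has to fit in memory. The function names and the two-second defaults are choices made for this sketch only.

```python
import re
from collections import deque

TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")
MARKER = "Unknown mailbox data content"


def kernel_time(line):
    """Return the kernel timestamp of a log line in seconds, or None."""
    match = TS_RE.search(line)
    return float(match.group(1)) if match else None


def window_around_error(lines, before=2.0, after=2.0):
    """Return the lines within the given window around the first mailbox error."""
    recent = deque()  # (timestamp, line) pairs from the last `before` seconds
    selected = []
    error_time = None
    for line in lines:
        stamp = kernel_time(line)
        if stamp is None:
            continue  # skip lines without a kernel timestamp
        if error_time is None:
            recent.append((stamp, line))
            while recent and recent[0][0] < stamp - before:
                recent.popleft()
            if MARKER in line:
                error_time = stamp
                selected = [text for _, text in recent]
        elif error_time <= stamp <= error_time + after:
            selected.append(line)
        else:
            break  # past the window, or the timestamps restarted after a reboot
    return selected
```

Only the first occurrence in the input is extracted; to get later ones, feed the function the remainder of the file.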

So I think I have found the same issue, see https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=192735

I can reproduce it within 5 minutes. You need a high amount of traffic over wifi (camera web interface) and a very low wifi signal. I have a Pi Zero, and it's enough to put your finger/hands around the onboard antenna to get the signal to almost zero (my router shows 15-20% signal). After about 1 minute in this state the wifi crashes.

@lategoodbye after a week away I turned my Pi on and had no issue as long as nothing used the AP; after a while I got the issue when I connected my phone to wlan0. I ran the command and the output can be found here: https://pastebin.com/77tGfRcU

For wlan1, I use a pretty old dongle. I cannot remember which driver I had to install to get it working, but this is what lsusb gives for the hardware I use:

Bus 001 Device 005: ID 0cde:0008 Z-Com XG-703A 802.11g Wireless Adapter [Intersil ISL3887]

I don't know if that helps but here is my experience:

I bought a Pi3 and tested it for a few days with internal wifi (not far away from the AP), and it seemed to work pretty well (I was not expecting high bitrates, it just had to be useful for a remote shell via ssh).

After putting it into an aluminium case it still seemed OK at first, but then wifi kept becoming unusable randomly, with up to minutes of no ping coming through. There were still occasions when it worked very well for some seconds, but then it switched to the "one keystroke per second" experience again or stopped working altogether.

It seems there is no "slow but usable" connection possible, only a "very good" one or an "unusable" one. This might be due to a bug in the firmware. I have no idea, and frankly I lost patience and now use a very tiny USB dongle instead, which works 100% stable.

Has anyone found a workaround to detect the issue (in AP mode) and programmatically reset the wlan device?

Not that I saw; restarting the interface didn't help. For me the workaround was to buy an external wifi USB device, and it works like a charm, but it is a pity, as I have now turned off the Pi's own wifi (sigh!)

Yes, the mailbox issue. I hope it gets fixed, but as a workaround I had to switch to an external device.

OK. We are at the mercy of Cypress on this one - it's a firmware issue, and they are the only ones with access. I'll keep reminding them.....we may need some more forensics, but will post here if that is the case.

My wlan disconnects and reconnects after a few seconds of inactivity (I assume this is power saving, even though I disabled it with iw, or maybe interference). Not sure if this is the same issue as discussed here (as it reconnects immediately).

If I connect with ssh -o ServerAliveInterval 5 ... it doesn't disconnect anymore.

$ uname -a
Linux pi3 4.4.50-hypriotos-v7+ #1 SMP PREEMPT Sun Mar 19 14:11:54 UTC 2017 armv7l GNU/Linux

@asssaf,
Not the issue. If it reconnected, it would generally only be a latency issue, but when running headless over WiFi (one of the main potential features of the PiZero-W), when the WiFi drops out and doesn't automatically reconnect, the system has crashed for all practical purposes.

Even if I have HDMI, mouse, and keyboard attached, under heavy network loads like with Motioneye, sometimes the only way to recover is to power cycle.

I have repeated the installation and configuration of Motioneye on a Pi2 with a WiPi USB WiFi dongle, and so far it's worked perfectly with loads that reliably killed the PiZero-W in a few hours. To me this seems to confirm it's a WiFi chip/driver problem and not an issue with Raspbian-stretch.

@PeterTheMaster1 @randyoo @joshfria

OK, a message for anyone who is seeing the mailbox issue regularly and would be able to test something for me.

We have a diagnostic firmware from Cypress, which may help track down the problem. If anyone with the mailbox issue would be willing to run this firmware, and when the mailbox issue occurs, dump the forensics, and post the results on here, that will be of great help. Note, this firmware should not be used for anything else but this test as it will be 'non-optimal'! Please comment here if you are able to do the test, and I will get in contact with the firmware and instructions.

@iurly : I wrote a script that would detect the problem, and then reboot, since bringing the interface down and up didn't help... But then it was rebooting so often that I could only get a useful device by taking it out of AP mode (and assigning AP duties to my USB dongle)

@JamesH65 : I'd be happy to help out, as before. Is it a new version of the diagnostic firmware? I posted a forensics capture 3 weeks ago (on this issue page) using the diagnostic/debugging firmware posted earlier on this page.

Yes, new firmware from Cypress as of Monday 25th Sept. It has more diagnostics in it. The previous forensics you supplied have narrowed down the issue; they need a bit more detail though. I've been running a machine for 24hrs so far with no mailbox error, so I'm currently unable to replicate it myself.

Can you email me on james.[email protected] and I can get the firmware to you. I don't want to publicise it more globally because it really is just for testing purposes.

@JamesH65 If you would be so kind as to provide a link to the new firmware, I can install it and try to capture forensics again, as you'd requested.

Unfortunately, providing a link here means it's publicly available, and since this is very much a test firmware I'd rather it didn't escape into the wild. Hence the request to do this via email. If that is a problem, I'll upload it somewhere and post a link.

@JamesH65 I know I have regular freezing of the internal Wireless card of the RPI3 in AP mode so I'd be more than willing to help, but I'm not quite sure it's the mailbox issue or something else. I actually searched my logs for such message but I haven't found it.

I'm running the firmware shipped with kernel 4.4.50 (and I can't really upgrade to the latest 4.9 because of a regression, see #2197), would that version show that message or was this added at a later stage?

Thanks!

@iurly you said one thing right, the crash problem in the broadcom drivers occurs in AP mode, and I do not know if this is related to the mailbox error. it seems there is more than 1 bug here, maybe it is some hardware bug due to the time the problem exists and no solution was found.

What really bothers me is the lack of a workaround, short of rebooting the whole system.
I mean, there's not even a way to reset the peripheral and restart hostapd?!?

@iurly you said one thing right, the crash problem in the broadcom drivers occurs in AP mode, and I do not know if this is related to the mailbox error. it seems there is more than 1 bug here, maybe it is some hardware bug due to the time the problem exists and no solution was found.

Just FYI, I am having problems in client/station mode too. Running LEDE master, 4.9 kernel and using firmware 7.45.41.46.

@JamesH65
Understand the desire to keep the test firmware from being published. Email would be fine, but I don't want to post my address here publicly, either, and I don't see a way to send messages on github.

Use my pi address above to email me and I will send you the firmware.

Re. Ap mode
There have been a few fixes since 4.4, so worth trying the latest stretch
to see if that issue still occurs.

Ah, when you edit a comment no email update is sent, and I edited in my Pi email to the entry above, so you may not have been updated. Use github website to see where you need to email me.

@JamesH65 Sent you an email. Glad to hear that the previous forensics capture helped narrow it down, at least... It looks like many people will be pleased when this issue is resolved.

@JamesH65
Here is a forensics capture from the firmware you emailed: https://pastebin.com/zdB36ttj
Hope it helps.

Awesome, will pass on to Cypress. Thanks for doing that.

I have a pi in a setup right now where it seems I can reproduce this at will. If collecting more forensics is helpful let me know. The mailbox error is all I can see in logs.

After I replaced the microSD in my Zero W, it has been connected for 7 days without issue. I don't think it ever survived this long. Sounds weird that an SD card could influence WiFi, but as they are both connected to the SDIO bus it could be possible that one influences the other.

The card I used before was a (probably cheap) 8GB Transcend class 4 that came with my UDOO Quad board. Right now it's a Samsung EVO 32GB. People running into problems might want to try if using a different card helps.

@stintel Interesting, but maybe it was some problem setting the software on the other microSD, or even damaged microSD.

Could that be power related? Maybe the cheap card momentarily drew too much power from the bus?

I loaded up Pelwell's posted firmware and saw a HUGE improvement. Before, an SSH to my Pi 0W was like dialing into a terminal with a 2400 baud modem and a crappy phone line. Now, I can run remote X and it works great.

Thanks!

I have the same problem. When transferring a huge number of file names (sync over FTP) from the Raspberry Pi 3's internal wifi to a Galaxy S5, the wifi stops working, although sometimes it works...

I had the same problem of the mailbox message with my RPi3 WiFi AP, but I have found a solution in this forum, and it worked for me. The solution was to change the following params in /etc/hostapd/hostapd.conf

wpa=3 changed to wpa=2
auth_algs=3 changed to auth_algs=1

I've tested it for 1 week and it doesn't show mailbox issues anymore.

I'm not sure if it would work for all of you, but you could try it and post here if it works.

This is the working hostapd.conf:

interface=wlan0
driver=nl80211
country_code=CO
ctrl_interface=wlan0
ctrl_interface_group=0
ssid=Mailbox Issue Test
hw_mode=g
channel=5
wpa=2
wpa_passphrase=mailbox
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP
beacon_int=100
auth_algs=1
macaddr_acl=0
wmm_enabled=1
eap_reauth_period=360000000
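For context, in hostapd `wpa` is a bit field (1 = WPA, 2 = WPA2, 3 = both) and `auth_algs` is likewise a bit field (1 = open system, 2 = shared key, 3 = both), so the change above restricts the AP to WPA2 with open-system authentication. If you want to check whether an existing config already matches this reported workaround, a small sketch like the following could do it. The function names are invented for this example, and whether the workaround helps on any given setup is not established beyond the report above.

```python
# Settings reported in this thread as avoiding the mailbox error.
REPORTED_WORKAROUND = {"wpa": "2", "auth_algs": "1"}


def parse_hostapd_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def differences_from_workaround(text):
    """Return {key: current_value} for settings that differ from the workaround."""
    settings = parse_hostapd_conf(text)
    return {
        key: settings.get(key)
        for key, wanted in REPORTED_WORKAROUND.items()
        if settings.get(key) != wanted
    }
```

An empty result means both settings already match; a missing key shows up with the value None.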

Any update on this issue? Or is there any known workaround?

Still experiencing this on a recently bought Pi Zero W with the latest stretch-lite and rpi-update as of yesterday.

Using the RPi to stream a camera feed via RTSP (udp), I can see the connection drastically worsen just before the WiFi connection cuts out; after that the WiFi connection never recovers and I have to power cycle the Pi0W.

A dmesg > dmesg.log only shows:

brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

If I move the Pi0W closer to my access point the issue does not occur.

I'm not using the Pi0W as access point, it's just a client. I have tried different power sources.

We are currently waiting for Cypress, the providers of the wireless chip,
to progress the issue. I'll ping them, again.

Well... I've upgraded to the latest kernel/firmware (apt-get upgrade then rpi-update), and now, even my Pi3 which had a rock solid wifi is also losing it after a few hours!! I know, if it ain't broke, don't fix it... should not have upgraded, but since I do tests from time to time in my 2nd Pi3 with the same SD card..

FWIW, I can also reproduce this problem at will. I created a forum post at Raspberry Pi which explains the issue:

https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=196018&p=1226143#p1226143

NOTE: I am not using the Pi as an AP. I can help with forensics or testing an experimental firmware etc. if that helps.

Same problem here. I set up ownCloud and can transfer files from my laptop without any problem.
But as soon as I transfer files with my Samsung Galaxy S7, wifi breaks and this appears:
raspberrypi kernel: [ 962.273390] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

My router is a FRITZ!Box 7490.

Thanks @srinathava for the post which describes my problem well!

Can the people who have tested with the test firmware please try the following - more debug information required by Cypress.

  1. when doing insmod, add "debug=0x100000"
  2. once the issue happens, save "dmesg" output

Thanks.

Another request for help on this one.

Can the people who have tested with the test firmware (see above) please try the following - more debug information required by Cypress.

  1. when doing insmod, add "debug=0x100000"
  2. once the issue happens, save "dmesg" output - this is the bit we are interested in.

Thanks.

@JamesH65 Just to let you know, I'm now attempting to collect the info, but the issue hasn't shown itself yet. I've only made a few small changes to the /etc/hostapd/hostapd.conf file, but those changes may have inadvertently worked around this issue. If the issue doesn't present itself within a few days, I'll revert those changes in an attempt to replicate the problem and collect the debug data.

Thanks for the help on this.

It would be interesting to see those changes you made to hostapd if, indeed that gets round the issue.

After 4 days of stability, I reverted my changes to the /etc/hostapd/hostapd.conf file, and after just a few hours, the issue recurred. Here is the output from dmesg:

[86340.811305] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[86374.278317] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86376.838299] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86376.838314] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[86379.398310] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86381.958740] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86381.958754] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[86384.518337] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86384.518353] brcmfmac: brcmf_cfg80211_get_tx_power: error (-110)
[86387.078328] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86389.638353] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[86389.638366] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
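For anyone collecting logs like the one above, here is a rough Python sketch for summarising the brcmfmac lines in a saved dmesg dump. It assumes the line format shown in this thread; the function name `summarize` and the small errno table are only illustrative (on Linux, errno 110 is ETIMEDOUT, i.e. the driver timed out waiting for the firmware to answer).

```python
import re
from collections import Counter

# Matches lines such as:
# [86374.278317] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+brcmfmac:\s+(?P<func>\w+):\s+(?P<msg>.*)"
)
# A trailing negative number, optionally followed by ")", e.g. "-110" or "(-110)".
ERRNO_RE = re.compile(r"-(\d+)\)?\s*$")

# Only the code seen in this log; 110 is ETIMEDOUT on Linux.
ERRNO_NAMES = {110: "ETIMEDOUT"}


def summarize(dmesg_text):
    """Return (timestamp of first mailbox error or None, Counter of (function, errno))."""
    first_mailbox = None
    errors = Counter()
    for line in dmesg_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        if "Unknown mailbox data content" in m["msg"]:
            if first_mailbox is None:
                first_mailbox = float(m["ts"])
            continue
        e = ERRNO_RE.search(m["msg"])
        if e:
            code = int(e.group(1))
            errors[(m["func"], ERRNO_NAMES.get(code, str(code)))] += 1
    return first_mailbox, errors
```

Feeding it the text of `dmesg > dmesg.log` shows at a glance when the first mailbox error appeared and which driver calls timed out afterwards.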

I'm running a software package called RaspAP, and I'm pretty sure it configured the hostapd.conf file on my behalf, though I'm not 100% sure.

Anyway, by commenting out this line in /etc/hostapd/hostapd.conf:
wpa_pairwise=TKIP
and replacing it with this:
rsn_pairwise=CCMP
I had stable operation for 4 days straight, whereas it used to crash within a few hours, or sometimes even minutes!

Hope all this helps.
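If you want to check whether your own hostapd.conf still offers TKIP, a small sketch along these lines may help. It only assumes the plain `key=value` layout shown in this thread; the helper names are made up for illustration and it does not touch the file.

```python
def parse_hostapd_conf(text):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def tkip_settings(settings):
    """Return the pairwise cipher options that still list TKIP."""
    return [
        key
        for key in ("wpa_pairwise", "rsn_pairwise")
        if "TKIP" in settings.get(key, "").split()
    ]
```

An empty list from `tkip_settings(parse_hostapd_conf(open(path).read()))` means neither option lists TKIP, which matches the configuration reported as stable above. Whether that avoids the lockup on other setups is not established here.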

Apologies if I am posting in the wrong place. I am experiencing strange behavior with the RPi3 internal (Broadcom) wlan when sending UDP unicast packets under Raspbian.
I send a small packet of data (2kb) once a second; at the receiver end this is blocked every 120 seconds for about 3-4 seconds. This test runs like clockwork, and I can reproduce it with iperf by doing the following:

Rpi3

iperf -u -c 192.168.1.22 -i 1 -t 3600

Ubuntu PC connected as WiFi client to RPi3 (IP 192.168.1.22 as above)

iperf -u -s -i 1

Guaranteed a blockage every 120 seconds. Interestingly, this does not seem to occur using TCP.
Finally, having downloaded and looked at the driver code (and not understanding anything), I noticed a suspicious mention of

#define BRCMF_SCAN_PASSIVE_TIME 120

which is then used in the driver code.

Could this be related? I am at my wits' end trying to resolve this.
Thx
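To put numbers on a periodic stall like this, one option is to log the arrival time of each packet on the receiver and look for gaps. The sketch below is only an illustration: the function names and the 2.5x threshold are arbitrary choices, not anything from the driver or iperf.

```python
def find_stalls(arrival_times, expected_interval=1.0, factor=2.5):
    """Return (start, duration) for every gap longer than factor * expected_interval."""
    stalls = []
    for prev, cur in zip(arrival_times, arrival_times[1:]):
        gap = cur - prev
        if gap > factor * expected_interval:
            stalls.append((prev, gap))
    return stalls


def stall_period(stalls):
    """Return the spacing between the starts of consecutive stalls."""
    starts = [start for start, _ in stalls]
    return [b - a for a, b in zip(starts, starts[1:])]
```

If `stall_period` keeps returning values close to 120, that backs up the "every 120 seconds" observation with data that can be attached to a new issue.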

I put the following into /etc/rc.local and mine seems to be working much better:

iwconfig wlan0 power off

PI zero w

Sean

Hi Sean
Thanks for the heads up. Unfortunately, this is not accepted by the Broadcom device; I get

Error for wireless request "Set Power Management" (8B2C) 
    Set failed on device wlan0; Invalid argument

However I DO use the following command in my setup to achieve the same goal
$ iw dev wlan0 set power_save off
this is accepted and if I inquire settings
$ iwconfig wlan0
I see
Power Management:off

So I'm pretty sure power saving is off, but that does not resolve this issue.
Thx
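For scripts that need to confirm the setting, here is a minimal sketch that reads it back with `iw dev <interface> get power_save` and parses the "Power save: on/off" line that iw prints. The function names are illustrative, and the interface defaults to wlan0 as an assumption.

```python
import subprocess


def parse_power_save(output):
    """Return True/False for 'Power save: on/off', or None if the line is absent."""
    for line in output.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "power save":
            return value.strip().lower() == "on"
    return None


def power_save_enabled(interface="wlan0"):
    """Run `iw dev <interface> get power_save` and parse the result."""
    result = subprocess.run(
        ["iw", "dev", interface, "get", "power_save"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_power_save(result.stdout)
```

Calling `power_save_enabled()` after boot is a quick way to check that a setting made in rc.local actually stuck.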

@LeeMooreImperas I suggest to open a separate issue for this and provide at least the Kernel version and Wifi firmware version.

I commented on this thread a long time ago but then had to stop looking at it because I was no longer able to reproduce it. Well, I have some new data and I find this very interesting.

I have two Raspberry Pi's; one B+ V1.2 and one original Raspberry PI (C) 2011.

If I run "4.1.19+ #858 Tue Mar 15 15:52:03 GMT 2016" on the RaspPi B+, the Edimax WiFi chip will exhibit the problem others have seen.

If I run "4.9.27+ #1 Thu May 11 17:40:53 UTC 2017" on the same RaspPi B+, the same Edimax WiFi chip will not show the problem.

I'm now wondering if it's more of an incompatibility with the hardware and I'm also reminded that with much older RaspPi boards the USB WiFi needed a special cable in order to augment the +5V power because the power coming from the board wasn't enough to drive it. I'm going to switch the SD cards back such that they exhibit the problem, then see if this type of cable helps.

Ok, I think I had that incorrect.

Running 4.9.27+ on the older RaspPi will exhibit the problem. Verifying now.

Ok, this is definitive and very interesting.

Using an original Raspberry Pi board (circa 2011) and running Linux 4.9.27+ (from "uname -a"), I can reproduce the problem of the Edimax USB WiFi chip losing the WiFi connection, and thus the IP address, every time, within a few minutes.

Using the same original Raspberry Pi board and the same version of Linux, but the ONLY change of simply using a USB cable which allows me to augment the +5V to the USB WiFi from a secondary source, the system is stable.

So, there definitely seems to be a problem with the Edimax USB WiFi card not getting enough power in this setup. This, obviously, doesn't help those who are using a Raspberry Pi with built-in WiFi, but in those cases I'm wondering if a similar problem is happening, and whether moving to a USB power adapter that supplies more current would make a difference.

It's possible that the Mains to USB adapter that powers the Pi might not be giving a clean 5V in some cases.
The AC has to be regulated and then smoothed out before it becomes 5V, and you still get a bit of ripple in the output DC.
The 5V from a laptop or PC is more likely to be ripple-free than that from a cheap mains-to-USB charger.

It'd be interesting to put an oscilloscope on the power supply to the wifi chip under different conditions to see what the ripple is like during failure / non-failure.

Please keep this issue to problems with the ONBOARD Brcm wireless chip on the Pi3 - if you have issues with other devices, please use the forum for advice. This is simply so information we need to pass on to Cypress does not get too confused.

@JamesH65
@lategoodbye

Hi James, Stefan,
So there are conflicting messages here: the issue I logged is directly related to the RPi3 BRCM WiFi.
Should this go in a different thread (as suggested by lategoodbye)?
I would have thought that this thread is specifically about my issue?

I am happy to move the issue

Thx

@LeeMooreImperas Although your issue is with the onboard wireless, yours is a pause every 2 minutes, whereas this issue describes a complete wireless lockup that happens at random intervals, so it feels like it isn't related. So it might be worth creating another issue. I was unfortunately a bit vague in my previous message.

Adding another "me too" on this.
Hardware: Raspberry Pi 3, Model B.
Kernel: Linux raspberrypi 4.9.70-v7+
OS: Raspbian GNU/Linux 9 (stretch)
Loaded Image: 2017-11-29-raspbian-stretch.img
Image MD5:
SDCard: Not sure on manufacturer, it came with the kit
Interfaces File: interfaces.txt
hostapd.conf: hostapd.txt
dmesg output (while working): dmesg_20171230.txt

The device is configured as an access point for my wireless network. My primary router is a Linksys EA6400 firmware version 1.1.40 (build 184085). Both the Linksys and the Pi are offering the same SSID on different channels. Pi is connected to the router via a wired connection with an unmanaged switch in between.
The OS load on the device is fairly fresh. I had a RetroPie image on the system and faced the same issues. I reloaded over to Raspbian to see if it worked any better.
I am seeing sporadic dropouts of the bridge. The primary symptom is that the wireless network provided by the Pi appears to become isolated from the wired network. The wired interface continues to function normally and I can reach the Pi via SSH. If I run a tcpdump on the wireless interface (wlan0), while in that state, I can still see traffic to and from connected devices.
Cycling the wireless connection (ifdown; ifup) does not seem to fix the issue. I have not yet tried cycling the bridge interface (br0). In general I have been rebooting the device, which fixes the issue.
I am not sure it's related; however, the issue seems to appear when I am trying to control my ChromeCast 2 after it has been running for a while. For example, if I am playing a show via Netflix on the ChromeCast and then try to pause the show, the bridge seems to drop out at that time. I haven't been able to catch this via tcpdump yet; but, that is a next step for me.
I have considered that it might be a heat issue; however, I had /opt/vc/bin/vcgencmd measure_temp running on a 30 second loop during one of the dropouts and my cpu temperature was in the 50C range. Not sure how to get a temperature reading on the LAN chip, as that may be where a heat issue is arising.

I'm happy to capture logs/pcaps as needed to help troubleshoot the issue further. Though, please be explicit in instructions as I have quite a few gaps in my knowledge on Linux.

EDIT: just had a drop out and did a sudo ifdown br0 && sudo ifup br0 and it seems to have started working again. I'll test again at my next drop out.

EDIT2: Here is a dmesg dump with the connection failed. The sudo ifdown br0 && sudo ifup br0 did not seem to recover the connection this time.
dmesg_20171220_failed.txt
Of particular note seems to be the error:
brcmfmac: brcmf_cfg80211_stop_ap: setting INFRA mode failed -7

EDIT3: Ran across this thread about a similar issue which referred back to this thread. Ran the requested change to the brcmfmac module to enable debugging. Had the failure trigger and captured the dmesg output:
dmesg_debug_failed.txt
Also, I noticed mention of a Samsung phone in the other thread as well. We've noticed that the bridge troubles with my Pi do seem to revolve around my Samsung Galaxy S7. My wife's Apple devices (iPhone and iPad) do not seem to trigger the issue.

EDIT4: Ran a sudo rmmod brcmfmac && sudo modprobe brcmfmac debug=0x100000 Followed by dmesg again. Output below:
dmesg_debug_failed_reset_driver.txt

Hmm, not the expected mailbox error. I'll pass it on to the Cypress developers in the new year.

Not sure if this is the same problem, but my symptom is onboard RPi3 wireless intermittent; 10 seconds of good pings, followed by 20-30 seconds of no pings, and repeat forever. When no pings, remote host does receive the ICMP echo requests and sends the ICMP echo replies. Access point returns ICMP host unreachable to the remote host.

The precondition is that both ethernet and wireless are connected. The chance of it happening increases greatly when dhcpcd is restarted unnecessarily.

Workaround is to set the network interface to promiscuous mode; sudo ifconfig wlan0 promisc. Symptom returns within ten seconds to a minute of sudo ifconfig wlan0 -promisc.

Further information available if needed, just ask.

@Sylver-Dragon, for me a tcpdump prevented the symptom, and perhaps you found the same; try -p flag, which turns off promiscuous mode; it let the symptom continue.

https://github.com/iiab/iiab/issues/638

@quozl I've tried running tcpdump on both the wireless interface and the bridge interface and had lockups while it was running. I'll give the promiscuous mode a shot and see if it makes a difference. Though, based on the debug output of the wireless interface driver, specifically:
wl0: _wlc_bss_update_beacon: out of mem, 0 bytes malloced
I'm guessing this is some sort of resource (memory?) leak on the part of the driver. When I have a bit more time, I want to do a packet capture and dig into the moment it locks up. I suspect my phone is sending some sort of either odd or malformed packet or series of packets at the device which is triggering the lockup. If I can capture and isolate that, it should help inform the fix.

Looks like a different fault to the mailbox issue we are currently tracking, which is annoying. Is your phone a Samsung BTW? The mailbox issue seems to be triggered more often by Samsung devices. If you can track down whatever causes the issue, that would be very useful.

I've been hunting the same (?) issue for weeks now. I feel like I must have read every report about this and similar issues. So here is some more info from me:

I use the internal wifi of a Raspberry Pi 3 as an access point. I use the standard Raspbian kernel and modules (Linux version 4.9.35-v7+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1014 SMP Fri Jun 30 14:47:43 BST 2017).

Wifi Firmware is: brcmfmac: Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378

I am pretty certain that this hardware setup used to work, but after some update (also of the kernel) I believe, things went south. Creating the AP works fine, but after using it for some time (30 min or so, not the same every time I think), streaming using a Chromecast, the connection stops working. It could be (but here I am not sure) that this happens most often when I pause/stop the stream, but only rarely in the middle of watching. When it fails, existing connections are dropped and new connection attempts are not accepted by any client. Reloading hostapd results in brcmf_cfg80211_stop_ap: setting INFRA mode failed -7 (cannot set mode to master). This can be fixed temporarily by reloading the driver: rmmod brcmfmac; modprobe brcmfmac. Things then work as expected again until it fails the next time. Alternatively, a reboot "fixes" the problem as well.

The only thing I get in the failed state (with enabled debug) in syslog is:

kernel: [ 3615.491795] brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets
hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: deauthenticated due to local deauth request

That error message doesn't make sense to me. It is timing out while waiting 'for no pending packets'? Anyway:

I have power save off:

iw wlan0 get power_save
Power save: off

roamoff is set to 1 and debug is enabled:

`systool -a -v -m brcmfmac
Module = "brcmfmac"

Attributes:
coresize = "222874"
initsize = "0"
initstate = "live"
refcnt = "0"
srcversion = "10E8F4629D109E78E1F506C"
taint = ""
uevent =

Parameters:
alternative_fw_path =
debug = "1048576"
roamoff = "1"
`

I don't have a Samsung phone, but some Android ones. None of these are connected to that access point. The only direct clients are two Chromecasts (one video, one audio-only, plus an Android Tablet). Everything else comes in via the wired interface.

@knarrff
Please search this page for my previous comment from 3 weeks ago for a good workaround.

@JamesH65
Never got an ack from you. Did you copy/relay the dmesg output that I shared from that comment 3 weeks ago to the Cypress guys?

@randyoo: I do have both "rsn_pairwise=CCMP" and "wpa=2" in my hostapd.conf. Doesn't help in my case. Non-secret entries from my file:
`
interface=wlan0

driver=nl80211
ssid=XXX
hw_mode=g
channel=1
ieee80211n=0
wmm_enabled=1
ht_capab=[HT40][SHORT-GI-20][DSSS_CCK-40]
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=2
wpa_key_mgmt=WPA-PSK
wpa_passphrase=XXX
rsn_pairwise=CCMP
`

It also becomes clear that the failure always seems to happen for me when I try to pause a Netflix stream to the Chromecast (which doesn't mean it always fails when I try this, just that whenever it fails, that was what I was doing). On the other hand, this might be a red herring, as this is what I do almost all of the time with that wifi network. It could just be that the problem occurs simply when a device tries to authenticate to the AP (like the Android tablet, which likely disabled wifi while sleeping). More testing will show. I'll try without Chromecast - just regular wifi on the tablet, including wifi-sleep cycles.

Doesn't look like my problem is the same as this issue, so I'll switch to lurk mode. My ifconfig wlan0 promisc did fix it for @holta (https://github.com/iiab/iiab/issues/638), but without it helping anyone else we must be looking at a different problem.

I can reliably reproduce this without Netflix or Chromecast by connecting to the network via Google tablet, then let the tablet go to sleep, resume (the tablet tries to reassociate), and at that moment the AP is "dead".

On a Linux machine, I get these in syslog when trying to associate (using the correct credentials):

`

[42231.476518] wlan7: send auth to b8:27:eb:33:98:14 (try 1/3)
[42231.583434] wlan7: send auth to b8:27:eb:33:98:14 (try 2/3)
[42231.694397] wlan7: send auth to b8:27:eb:33:98:14 (try 3/3)
[42231.799368] wlan7: authentication with b8:27:eb:33:98:14 timed out
[42236.585750] wlan7: authenticate with b8:27:eb:33:98:14
[42236.598833] wlan7: send auth to b8:27:eb:33:98:14 (try 1/3)
[42236.602344] wlan7: authenticated
[42236.603480] wlan7: associate with b8:27:eb:33:98:14 (try 1/3)
[42236.619322] wlan7: RX AssocResp from b8:27:eb:33:98:14 (capab=0x411 status=0 aid=1)
[42236.623181] wlan7: associated
[42236.623325] IPv6: ADDRCONF(NETDEV_CHANGE): wlan7: link becomes ready
[42236.625464] wlan7: Limiting TX power to 30 (30 - 0) dBm as advertised by b8:27:eb:33:98:14
[42239.730365] wlan7: deauthenticated from b8:27:eb:33:98:14 (Reason: 2=PREV_AUTH_NOT_VALID)
[42241.243434] wlan7: authenticate with b8:27:eb:33:98:14
[42241.256326] wlan7: send auth to b8:27:eb:33:98:14 (try 1/3)
[42241.260724] wlan7: authenticated
[42241.263403] wlan7: associate with b8:27:eb:33:98:14 (try 1/3)
[42241.279537] wlan7: RX AssocResp from b8:27:eb:33:98:14 (capab=0x411 status=0 aid=1)
[42241.282500] wlan7: associated
[42241.336166] wlan7: Limiting TX power to 30 (30 - 0) dBm as advertised by b8:27:eb:33:98:14
[42244.392213] wlan7: deauthenticated from b8:27:eb:33:98:14 (Reason: 2=PREV_AUTH_NOT_VALID)
[42253.916626] wlan7: authenticate with b8:27:eb:33:98:14
[42253.928966] wlan7: send auth to b8:27:eb:33:98:14 (try 1/3)
[42253.936020] wlan7: authenticated
[42253.939533] wlan7: associate with b8:27:eb:33:98:14 (try 1/3)
[42253.943361] wlan7: RX AssocResp from b8:27:eb:33:98:14 (capab=0x411 status=0 aid=2)
[42253.945415] wlan7: associated
[42254.035149] wlan7: Limiting TX power to 30 (30 - 0) dBm as advertised by b8:27:eb:33:98:14
[42257.053762] wlan7: deauthenticated from b8:27:eb:33:98:14 (Reason: 2=PREV_AUTH_NOT_VALID)
`

b8:27:eb:33:98:14 is the RPI3 in question, on which I again get the dmesg-entries:
brcmfmac: brcmf_netdev_wait_pend8021x: Timed out waiting for no pending 802.1x packets

I don't quite understand why the AP is sending PREV_AUTH_NOT_VALID while I am apparently associated. I am under the impression that authentication comes before association. There shouldn't be a case where I am associated but not authenticated.
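To make the pattern in the client log easier to see, here is a rough sketch that measures how long each association lasted before the deauthentication arrived. It assumes the kernel log format shown above; the function name is illustrative.

```python
import re

# Matches "wlan7: associated" and
# "wlan7: deauthenticated from ... (Reason: 2=PREV_AUTH_NOT_VALID)".
EVENT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(\w+): (associated|deauthenticated)\b"
    r"(?:.*\(Reason: (\d+)=(\w+)\))?"
)


def association_lifetimes(log_text):
    """Return a list of (seconds associated, deauth reason name)."""
    lifetimes = []
    assoc_at = None
    for line in log_text.splitlines():
        m = EVENT_RE.search(line)
        if not m:
            continue
        ts = float(m.group(1))
        if m.group(3) == "associated":
            assoc_at = ts
        elif assoc_at is not None:
            lifetimes.append((round(ts - assoc_at, 3), m.group(5)))
            assoc_at = None
    return lifetimes
```

On the log above, every association is torn down roughly three seconds after it completes, each time with reason 2 (PREV_AUTH_NOT_VALID).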

Hi

I'm using a Pi3 as a media server, comms are through the on-board WiFi

Raspbian Stretch Lite 4.9 upgrade update (now)
Plex Media Server

I'm getting...

kernel: [ 1958.899715] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012

in dmesg and syslog when connecting to the Pi using the BubbleUPnp client on a Samsung S5 SM_G900F (Android 7.1.2). This is pretty much guaranteed and requires a reboot for the Pi WiFi to become usable again.

On my old Sony Xperia XP Android 6.0.1 again running BubbleUPnp it works fine so far. This is my solution. However, if I can be of any assistance getting to the bottom of this then I'll be pleased to contribute.

John

It also works on the iPad running mConnectLite

@johnthesoftwareathome Please write an email to James Hughes from Raspberry Pi so he can send you a Wifi debug firmware.

Email address posted through Raspberry Pi contact page fao James Hughes

OK, we have a new debug firmware from Cypress that I would like people to test with - this has more debug in it, but no fixes, so only for those happy to test. If you have already sent me your email address, indicate here that you would like to do some testing and I will send out the firmware, or contact me via PM on the Pi forums.

To save people digging around for how to install/run the new firmware.

Copy the debug firmware file to :

/lib/firmware/brcm/

(You will want to back up the original first)

I think you need to reboot at this stage.

Now restart the Linux driver in debug mode

sudo rmmod brcmfmac && sudo modprobe brcmfmac debug=0x100000

Make it go wrong..!!

Dump dmesg to file and post here.

To add to what James says, you may prefer to avoid the rmmod/modprobe sequence by adding brcmfmac.debug=0x100000 to /boot/cmdline.txt.
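Since /boot/cmdline.txt has to stay a single line, editing it by script is easy to get wrong. Below is a small sketch of a helper that appends the parameter exactly once. The function name is made up, and it works on the text only; reading and writing the file (and backing it up first) is left to the caller.

```python
def add_kernel_param(cmdline, param="brcmfmac.debug=0x100000"):
    """Return cmdline with param appended once, kept on a single line."""
    tokens = cmdline.split()
    key = param.split("=", 1)[0]
    # Drop any earlier setting of the same key so only the new value remains.
    tokens = [t for t in tokens if t.split("=", 1)[0] != key]
    tokens.append(param)
    return " ".join(tokens) + "\n"
```

Running it twice gives the same result as running it once, so it is safe to call from a setup script.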

@JamesH65 I'd be happy to help test. Though having just now registered over at the Pi Forum, I'm not able to send messages. Using the same username over there, if that helps.

I've tried the new debug firmware yesterday and also added brcmfmac.debug=0x100000 to /boot/cmdline.txt.

However, strangely I didn't see any debug output in dmesg. Even more strangely, where I could reliably reproduce the problem before, it worked all evening regardless of what I did. I did not have a single problem, and all I did differently was using the new firmware file (md5 sum ba679a85c1dc76e9775603af45440bc0) instead of the old and adding the entry to /boot/cmdline.txt instead of adding the option using modprobe. I didn't have time yesterday to go back to the old firmware to see if this reverts to the old problems. I'll report back once I did. In the meantime: is all that changed in that firmware really "more debug"?

I thought it was just debug, but will go back to Cypress as clearly
something else has changed, Hopefully in a good way!

--
James Hughes
Principal Software Engineer,
Raspberry Pi (Trading) Ltd

My experience was similar to @knarrff's except that I did see debug messages in dmesg.

Previously my Samsung S5 was unusable as a plexserver client, but when I loaded the debug firmware it worked (with, as I said debug messages in dmesg) so I reverted to my original binary (backed up and size checked) and it still works. So I'm now running again with the debug firmware (I haven't tried the cmdline.txt mod, just the rmmod/modprobe) and have listened to some hours of music with no errors. I've tried activating most of the many WiFi devices I have scattered around, to no effect.

I'll try this for a few days to see if anything happens then reload the original and try again. I may not have powered off the Pi between reboots. I'll make sure I do this when I revert back to see if it might be some sort of register retention.

Tonight I uploaded older firmware (taken from a Raspbian installation image; more info on the versions I use below), and reloaded the module with that (and debug enabled). I even rebooted in between. The short output in dmesg confirms that the old version is now loaded. And as with @johnthesoftwareathome, it continued to work all evening long, despite doing stuff that would have taken down the wifi quite a few times in the past.

So, for now my task seems to be to get it back to "not working" to have a chance to find out what is going on. My next try, although not today anymore, will be doing a hard reset (removing power for some time instead of just using the 'reboot' command), and using a completely new installation from a fresh image.

Also, I sadly cannot exclude the possibility that the image I got failures with was yet a different version, as I forgot to make a backup before overwriting it with the debug image. Maybe @johnthesoftwareathome could post which exact image he is using and had/has problems with? On the other hand, I back then only updated the firmware using the standard packages, and I do have package version firmware-brcm80211 (1:0.43+rpi6) installed. Although the last entry in the changelog doesn't specify the firmware version, the second to last does: 7.45.41.26, which is older than the one from the image. Assuming the changelog was written correctly, that would be a strong indication that the firmware wasn't replaced since the image was created, and that the one I call 'image' is the one I used before.

Information about my two firmware files (image: the one from the Raspbian installation image, debug: the one I received directly from @JamesH65):

debug:
Firmware version = wl0: Oct 23 2017 03:55:53 version 7.45.98.38 (r674442 CY) FWID 01-e58d219f
md5sum: ba679a85c1dc76e9775603af45440bc0
image:
Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
md5sum: 5f520a38ab4e943bfa1ba102f80fb2a0
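When swapping firmware files back and forth it is easy to lose track of which one is installed. Here is a sketch that hashes a file and compares it with the two md5 sums quoted above. The labels in `KNOWN` and the function names are only illustrative, and the path of the firmware file is left as a parameter because it is not spelled out in this thread.

```python
import hashlib

# md5 sums and version strings as quoted in the comment above.
KNOWN = {
    "ba679a85c1dc76e9775603af45440bc0": "debug (7.45.98.38)",
    "5f520a38ab4e943bfa1ba102f80fb2a0": "image (7.45.41.46)",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_firmware(path):
    """Return the label for a known firmware file, or 'unknown'."""
    return KNOWN.get(md5_of_file(path), "unknown")
```

The "Firmware version" line in dmesg remains the authoritative check of what the chip actually loaded; this only identifies the file on disk.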

@johnthesoftwareathome: what does the new "debug" output look like? I still don't get anything that even remotely looks like extensive debug, regardless of how I load the module. I get zero entries during operation, and even after boot all that looks somewhat relevant is:

as root: dmesg | grep brcm
[ 0.000000] Kernel command line: 8250.nr_uarts=0 bcm2708_fb.fbwidth=640 bcm2708_fb.fbheight=480 bcm2708_fb.fbdepth=16 bcm2708_fb.fbswap=1 vc_mem.mem_base=0x3ea00000 vc_mem.mem_size=0x3f000000 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=PARTUUID=f8e4f7c2-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait brcmfmac.debug=0x100000
[ 3.500135] usbcore: registered new interface driver brcmfmac
[ 3.662113] brcmfmac: Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
[ 3.774278] brcmfmac: power management disabled
[ 4.711443] brcmfmac: power management disabled

Small update: looking back at one of my older comments in this thread I can actually confirm that the old firmware I used today ('image') is the one that I had trouble with just up until I tried the newer debug image.

Empty house, so finally got round to listening to Bowie's last album. Everything worked perfectly (the album not so). Away from home until tomorrow, I'll pick this up then.

Managed to get the original firmware to fail as before but not reliably between using it and the debug firmware. Currently just rebooting with the debug stuff with no failures as yet.

I misunderstood what @knarrff meant about the debug output and assumed that he couldn't see that the new firmware had installed, rather than meaning he expected some sort of debug stream (which I can't see either). He has a point. In the event of this failing, will we see anything, or is the debug hex wrong?

Also one of the failures didn't crap out immediately. It allowed me to ssh back in before needing a reboot. Syslog contains the following..

Jan 13 08:34:48 plexServer kernel: [ 46.648630] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Jan 13 08:35:14 plexServer kernel: [ 72.161473] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Jan 13 08:35:14 plexServer kernel: [ 72.161484] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)

That's a very familiar set of error messages, but it's still useful to know that you were experiencing the same issue which now looks like it may be fixed.

Cypress are preparing a new firmware release for us - we'll post here when something is available for testing. Thanks everyone for your interest, time and patience.

Ok. Thanks for a working driver.

Things may have moved on since this..

https://tech4research.wordpress.com/2014/07/23/brcmfmac-debugging-and-appropriate-debug-values/

and I appreciate that the debug switch for the new firmware may be a special addition, but these switches appear to work for both the original and 'debug' firmware and the expected stream of debug is spewed forth.

Probably already been seen; but, TPLink is claiming that android devices are DoS'ing their devices with MDNS packets when they wake up from sleep and try to reconnect with Chromecast or similar devices.
Digging into a pcap I got of a disconnect on my own device, I can see ~3,500 MDNS packets come in over ~2.25 seconds right before my connection dies. It seems to fit this pattern and may be related.
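For scale, ~3,500 packets in ~2.25 seconds works out to roughly 1,556 packets per second on average. If you export packet timestamps from your own capture, a sketch like the following can find the busiest window. The function names are illustrative and the one-second window is an arbitrary choice.

```python
def average_rate(packet_count, seconds):
    """Average packets per second over the whole burst."""
    return packet_count / seconds


def busiest_window(timestamps, window=1.0):
    """Return the largest number of packets seen in any `window`-second span."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Slide the left edge forward until the span fits in the window.
        while t - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

Comparing `busiest_window` just before a lockup with the same figure during normal operation should show whether a burst like this lines up with the failure.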

Just to add/confirm some info in this issue:

  • Setting the wifi interface to promiscuous (ifconfig wlan0 promisc) seems to mitigate the issue
  • The problem seems to be caused only by my Android 7.1.2 Galaxy S7 phone (which I got a week ago and this is when the problems started)

I am running Debian Buster with aarch64 on my Pi3 and run a Nextcloud server on it. scp'ing larger files from a Linux laptop does not cause any problems nor does Nextcloud sync from that laptop, but as soon as I upload a batch of files from the Galaxy the error Unknown mailbox data content: 0x40012 will pop up and Wifi connectivity is lost.

The brcmfmac firmware I am using is 7.45.41.26 (r640327) FWID 01-4527cfab

Unfortunately I do not have an older Android to test.

I tcpdumped an upload from the Samsung to the Pi3, but then the Wifi was in promiscuous mode and everything worked just fine. If I find the time I will have a look at the pcap and report back if I find anything useful/interesting.

P.S.: Cast (the main offender described in the TPLink article) is not active (or at least I can't see it in the connectivity settings).

Hi everybody,

I just want to confirm that switching off power save mode and enabling promiscuous mode fixed the issue for me: for the first time it managed to stay connected for 24 hours.

sudo iw wlan0 set power_save off
sudo ifconfig wlan0 promisc

Thanks,
Luc
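
A small sketch for checking that the two settings above actually took effect, for example after a reboot. The helper names are made up for illustration. They assume that `iw dev wlan0 get power_save` prints a line like "Power save: off" and that `ip link show wlan0` lists PROMISC among the interface flags when promiscuous mode is on.

```shell
# Hypothetical helpers; both read command output on stdin.

# Succeeds if `iw dev wlan0 get power_save` reported power save as off.
power_save_is_off() {
    grep -q 'Power save: off'
}

# Succeeds if `ip link show wlan0` lists PROMISC among the interface flags.
promisc_is_on() {
    grep -q '<[^>]*PROMISC[^>]*>'
}
```

Example: `iw dev wlan0 get power_save | power_save_is_off && ip link show wlan0 | promisc_is_on && echo "workaround active"`.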

Please see this forum post for details on a new firmware release. Anyone seeing the mailbox issue, or indeed any wireless problems, should try this out to see if it helps.

https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=203508

@JamesH65
Hi James,
Can you provide some installation instructions? Is the .bin file in the archive a self-installing executable?
Thx
Lee

Instructions now on the linked forum page.

Checking back in after running the new firmware for a bit over a week. So far, it's been solid. I have repeatedly woken my Samsung device up after long periods and the wireless interface on my Pi has kept running. I believe that I had one instance where it dropped temporarily and then recovered; though, I have not been able to reproduce that. All in all, it looks solid. Thank you to both James for sticking with this and to the Cypress team for getting this one fixed.

Thanks for the report.

Can somebody tell me whether the firmware fix has already made it into the official Raspbian distribution, so that it can be installed via apt update? If not yet, please let me know once it has.

Yes. https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=203508&start=25#p1270212
Some issues being reported on Pi0W after generally updating, but it's not totally clear if it is just the firmware change or something else - https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=204882

I have updated the firmware

$ md5sum /lib/firmware/brcm/brcmfmac43430-sdio.bin
ba679a85c1dc76e9775603af45440bc0  /lib/firmware/brcm/brcmfmac43430-sdio.bin

but still have the same issue

$ dmesg | grep brcmfmac
[    3.917447] usbcore: registered new interface driver brcmfmac
[    4.079889] brcmfmac: Firmware version = wl0: Oct 23 2017 03:55:53 version 7.45.98.38 (r674442 CY) FWID 01-e58d219f
[    5.079252] brcmfmac: power management disabled
[   27.125197] brcmfmac: power management disabled
[   92.278751] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[  338.327158] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  340.887163] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  340.887181] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[  360.407241] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  362.967295] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  362.967308] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110

The following does not avoid the issue either

sudo iw wlan0 set power_save off
sudo ifconfig wlan0 promisc

I am using the RPi3 as an access point with hostapd and dnsmasq.
I can always reproduce the issue when starting a download in the Spotify app on my Android phone.

Do I need to update the following file, too?

$ md5sum /lib/firmware/brcm/brcmfmac43430-sdio.txt
9a88b55134d9f8f3ad2331b93f4b7b79  /lib/firmware/brcm/brcmfmac43430-sdio.txt

Will it be used by the driver or can it be ignored?

Edit:
Yes. The brcmfmac43430-sdio.txt is also required.
But I am using the latest greatest versions from https://github.com/RPi-Distro/firmware-nonfree/tree/927fa8ebdf5bcfb90944465b40ec4981e01d6015/brcm

I have also updated my 4.9.35-v7+ kernel to 4.14.18-v7+.
But the issue still exists.

Ran into the same problem on my RPi3: Wifi gets dropped after some uptime (e.g. overnight) with almost no traffic.
dmesg output only shows:

[ +3,519999] brcmfmac: brcmf_do_escan: error (-110)
[ +0,000011] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
[  +3,519987] brcmfmac: brcmf_do_escan: error (-110)
[  +0,000012] brcmfmac: brcmf_cfg80211_scan: scan error (-110)

I tried reloading the driver (rmmod & modprobe brcmfmac):

[  +0,100025] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -5
[  +0,000014] brcmfmac: brcmf_cfg80211_get_tx_power: error (-5)
[  +0,519934] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[  +0,000050] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[  +0,000672] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[  +0,000012] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[  +0,221254] usbcore: deregistering interface driver brcmfmac
[Mär12 21:18] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[  +0,010071] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43430-sdio.bin for chip 0x00a9a6(43430) rev 0x000001
[  +0,000285] usbcore: registered new interface driver brcmfmac
[  +2,649115] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  +0,005807] brcmfmac: brcmf_c_get_clm_name: retrieving revision info failed (-110)
[  +0,000010] brcmfmac: brcmf_c_process_clm_blob: get CLM blob file name failed (-110)
[  +0,000008] brcmfmac: brcmf_c_preinit_dcmds: download CLM blob file failed, -110
[  +0,000007] brcmfmac: brcmf_bus_started: failed: -110
[  +0,000021] brcmfmac: brcmf_sdio_firmware_callback: dongle is not responding

That somehow didn't work: the driver loaded but I got no interface.
Tried again:

[Mär12 21:26] usbcore: deregistering interface driver brcmfmac
[ +32,681743] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[  +0,007275] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43430-sdio.bin for chip 0x00a9a6(43430) rev 0x000001
[  +0,000257] usbcore: registered new interface driver brcmfmac
[  +0,116144] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Aug  7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
[  +0,000641] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 7.11.15 Compiler: 1.24.2 ClmImport: 1.24.1 Creation: 2014-05-26 10:53:55 Inc Data: 9.10.41 Inc Compiler: 1.29.4 Inc ClmImport: 1.36.3 Creation: 2017-08-07 00:37:47
[  +0,184532] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  +0,000034] brcmfmac: power management disabled
[  +1,833812] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

..and I'm up again.
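
The two reload attempts above differ in what follows the "registered new interface driver brcmfmac" line: a "Firmware version" line when the reload worked, "dongle is not responding" when it didn't. A rough sketch of that check (the function name is made up, and it only looks at those three messages):

```shell
# Hypothetical helper: reads dmesg text on stdin and succeeds only if the most
# recent brcmfmac driver registration was followed by a firmware version line.
brcm_reload_ok() {
    awk '
        /registered new interface driver brcmfmac/ { ok = 0; seen = 1; next }
        seen && /Firmware version =/               { ok = 1 }
        seen && /dongle is not responding/         { ok = 0 }
        END { exit (ok ? 0 : 1) }
    '
}
```

After `sudo rmmod brcmfmac; sudo modprobe brcmfmac` and a few seconds' wait (how long is a guess; the failed attempt above took about 2.6 s to time out), `dmesg | brcm_reload_ok || echo "reload failed, try again"` would tell you whether another attempt is needed.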

Pi3 runs kernel '4.14.24-v7+ #1097' - firmware is the older one from Aug 7 2017 - the same firmware blob works flawlessly (uptime >2 months) on a Pi Zero W running kernel '4.9.77+ #1081'
Both Pis are connected to the same router & a room apart. Both are connected via WiFi only.

Probably worth using the latest firmware on 4.14, since 4.14 has all the required changes to work with that firmware.

:) Updated to latest fw (Oct 23 2017 03:55:53 version 7.45.98.38) yesterday after posting - seems to work atm - let's see what happens..

It appears that raspbian reverted back to the August 2017 firmware package. Are there any new requirements for the rpi 3B+ wireless?

The latest package in the Foundation's stretch repo, firmware-brcm80211 1:20161130-3+rpt3, has the Oct 23 2017 firmware (version 7.45.98.38) for Pi3/Pi0W, plus the proper firmware for Pi3+.

I've also got that problem with wifi dying.

Mar 17 18:25:28 hassass kernel: [10279.186321] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Mar 17 18:25:30 hassass kernel: [10281.665090] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Mar 17 18:25:30 hassass kernel: [10281.665622] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Mar 17 18:25:30 hassass kernel: [10281.665638] brcmfmac: brcmf_run_escan: error (-110)
Mar 17 18:25:30 hassass kernel: [10281.665647] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
Mar 17 18:26:30 hassass kernel: [10341.665866] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout

This is with 4.14.27-v7+ and with
/sbin/iw dev wlan0 set power_save off
/sbin/ifconfig wlan0 promisc
in /etc/rc.local.

same error messages as @flok99 - using latest firmware (rpi-update) on stretch.

OK, so the bug that we thought Cypress had fixed is still there. Back to Cypress it goes. Took a year to get this version. Holding breath not recommended.

Must confirm the version though, please post contents of

dmesg | grep brcmfmac

--
James Hughes
Principal Software Engineer,
Raspberry Pi (Trading) Ltd
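
If you only want the version number out of that output, here is a minimal sketch. The function name is made up, and it assumes the "Firmware version = wl0: <date> version <x.y.z>" line format seen in the logs in this thread. The driver prints the line again after each reload, so only the last match is kept.

```shell
# Hypothetical helper: reads `dmesg` output on stdin and prints the most
# recently reported brcmfmac firmware version, e.g. 7.45.98.38.
brcm_fw_version() {
    sed -n 's/.*Firmware version = .* version \([0-9.]*\).*/\1/p' | tail -n 1
}
```

Example: `dmesg | brcm_fw_version`.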

[ 4.112717] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 4.119827] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 4.120314] usbcore: registered new interface driver brcmfmac
[ 4.440371] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 4.440958] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
[ 10.911757] brcmfmac: power management disabled
[ 12.016088] brcmfmac: power management disabled
[ 2074.090674] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -5
[ 2074.090687] brcmfmac: brcmf_cfg80211_get_tx_power: error (-5)
[ 2074.090745] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2074.090753] brcmfmac: brcmf_link_down: WLC_DISASSOC failed (-5)
[ 2074.610583] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2074.611992] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2074.613945] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2074.613971] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[ 2074.729716] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2074.729733] brcmfmac: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -5
[ 2074.871693] usbcore: deregistering interface driver brcmfmac
[ 2074.929084] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 2074.936897] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 2074.937139] usbcore: registered new interface driver brcmfmac
[ 2075.118180] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 2075.118706] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
[ 2075.215365] brcmfmac: power management disabled
[ 2075.263751] brcmfmac: power management disabled
[ 2085.475001] brcmfmac: power management disabled
[ 2124.380808] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2124.381146] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 2124.381156] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[ 2124.622345] usbcore: deregistering interface driver brcmfmac
[ 2124.705432] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 2124.714194] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 2124.716213] usbcore: registered new interface driver brcmfmac
[ 2124.929556] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 2124.929993] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
[ 2125.105218] brcmfmac: power management disabled
[ 2125.150290] brcmfmac: power management disabled
[ 8237.434034] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[ 8239.890302] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 8239.890822] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 8239.890835] brcmfmac: brcmf_run_escan: error (-110)
[ 8239.890845] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
[ 8254.280425] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -5
[ 8254.280438] brcmfmac: brcmf_cfg80211_get_tx_power: error (-5)
[ 8254.280491] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8254.280498] brcmfmac: brcmf_link_down: WLC_DISASSOC failed (-5)
[ 8254.800394] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8254.803873] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8254.808353] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8254.808370] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[ 8254.881402] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8254.881420] brcmfmac: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -5
[ 8255.001550] usbcore: deregistering interface driver brcmfmac
[ 8255.071184] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 8255.077098] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 8255.077348] usbcore: registered new interface driver brcmfmac
[ 8257.730418] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 8257.751038] brcmfmac: brcmf_c_get_clm_name: retrieving revision info failed (-110)
[ 8257.751049] brcmfmac: brcmf_c_process_clm_blob: get CLM blob file name failed (-110)
[ 8257.751068] brcmfmac: brcmf_c_preinit_dcmds: download CLM blob file failed, -110
[ 8257.751076] brcmfmac: brcmf_bus_started: failed: -110
[ 8257.751114] brcmfmac: brcmf_sdio_firmware_callback: dongle is not responding
[ 8304.417684] usbcore: deregistering interface driver brcmfmac
[ 8304.486099] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 8304.493613] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 8304.494078] usbcore: registered new interface driver brcmfmac
[ 8304.686761] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 8304.687203] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
[ 8304.829994] brcmfmac: power management disabled
[ 8304.907662] brcmfmac: power management disabled
[ 8357.441791] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[ 8359.891146] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 8359.891655] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 8359.891668] brcmfmac: brcmf_run_escan: error (-110)
[ 8359.891677] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
[ 8371.731226] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 8371.731731] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 8371.731746] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 8373.941267] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -5
[ 8373.941280] brcmfmac: brcmf_cfg80211_get_tx_power: error (-5)
[ 8373.941330] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8373.941338] brcmfmac: brcmf_link_down: WLC_DISASSOC failed (-5)
[ 8374.461245] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8374.461942] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8374.463553] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8374.463573] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[ 8374.564729] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[ 8374.564750] brcmfmac: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -5
[ 8374.702401] usbcore: deregistering interface driver brcmfmac
[ 8374.759839] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 8374.767561] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 8374.771137] usbcore: registered new interface driver brcmfmac
[ 8377.411255] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 8377.431924] brcmfmac: brcmf_c_get_clm_name: retrieving revision info failed (-110)
[ 8377.431934] brcmfmac: brcmf_c_process_clm_blob: get CLM blob file name failed (-110)
[ 8377.431941] brcmfmac: brcmf_c_preinit_dcmds: download CLM blob file failed, -110
[ 8377.431949] brcmfmac: brcmf_bus_started: failed: -110
[ 8377.432003] brcmfmac: brcmf_sdio_firmware_callback: dongle is not responding
[ 8424.133114] usbcore: deregistering interface driver brcmfmac
[ 8424.229631] brcmfmac: F1 signature read @0x18000000=0x15264345
[ 8424.237210] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 8424.239352] usbcore: registered new interface driver brcmfmac
[ 8424.460736] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 8424.461174] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
[ 8424.646993] brcmfmac: power management disabled
[ 8424.708633] brcmfmac: power management disabled

@flok99

brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 
Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04 

It looks very much like you are on the newer Pi3B+ and not on the original Pi3, so maybe this is a different matter?

Entirely different chip and firmware, although the Linux-side driver (brcmfmac) is the same.
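
Since both boards use the same driver, the firmware blob name in dmesg is the quickest way to tell which one a log came from. A rough sketch follows. The function name is made up, and the board mapping is only what this thread reports (43430 on Pi 3B / Pi Zero W, 43455 on Pi 3B+), not an exhaustive list.

```shell
# Hypothetical helper: reads dmesg text on stdin and guesses the board from the
# firmware blob named in the brcmf_fw_map_chip_to_name line.
brcm_board_guess() {
    blob=$(sed -n 's|.*using brcm/\(brcmfmac[0-9]*-sdio\.bin\).*|\1|p' | tail -n 1)
    case "$blob" in
        brcmfmac43430-sdio.bin) echo "43430: Pi 3B or Pi Zero W" ;;
        brcmfmac43455-sdio.bin) echo "43455: Pi 3B+" ;;
        *)                      echo "unknown" ;;
    esac
}
```

Example: `dmesg | brcm_board_guess`.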

I think it's best to have another thread for Pi3B+ issues, and refer back to this one as necessary, otherwise it is going to be very difficult to track. Can @flok99 please create a new issue with his reports, ensuring the title refers to the 3B+? I'll change the title of this one to reflect that it's for Pi3B only.

done

Does anyone subscribed to this issue, running the 3B (not plus), still see the problems with the latest firmware and kernel? Would like any reports of continuing failure - last posts on subject above seem to imply things are now working OK.

My 3B has been up for 44 days with this:

Linux rpi3 4.14.24-v7+ #1097 SMP Mon Mar 5 16:42:05 GMT 2018
brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 23 2017 03:55:53 version 7.45.98.38 (r674442 CY) FWID 01-e58d219f

No problems since then..

Good news. Unless I hear otherwise, I'll probably close this thread in a week or two, although it can be reopened at any time if problems reoccur.

I started having this issue about a week ago, having not heard of it before then. I also use the pi most often with a samsung phone as router - mine is an s4. I am writing this connected direct to the s4 with usb, ie using rndis. Here are my details from today's boot:
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
thenry@pi3portable:~ $ dmesg | grep brcmfmac
[ 9.965782] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[ 9.972059] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43430-sdio.bin for chip 0x00a9a6(43430) rev 0x000001
[ 9.972250] usbcore: registered new interface driver brcmfmac
[ 10.147562] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
[ 10.148507] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 7.11.15 Compiler: 1.24.2 ClmImport: 1.24.1 Creation: 2014-05-26 10:53:55 Inc Data: 9.10.41 Inc Compiler: 1.29.4 Inc ClmImport: 1.36.3 Creation: 2017-08-07 00:37:47
[ 18.538641] brcmfmac: power management disabled
[ 30.629545] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[ 33.191450] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 33.194850] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 35.751496] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 35.754898] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 35.754906] brcmfmac: brcmf_pno_clean: failed code -110
[ 43.271438] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 43.274800] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 43.274807] brcmfmac: brcmf_do_escan: error (-110)
[ 43.274811] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
[ 7673.758073] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 7673.761437] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 7673.761454] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[ 7676.328075] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 7676.331449] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 7676.331466] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
[ 7678.878084] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 7678.881460] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 7681.448101] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[ 7689.118098] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 7689.118241] brcmfmac: power management disabled
[ 7691.678100] brcmfmac: brcmf_cfg80211_set_power_mgmt: error (-110)
[ 7694.238122] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[ 7696.798118] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
[ 7699.358158] brcmfmac: brcmf_do_escan: error (-110)
[ 7699.358167] brcmfmac: brcmf_cfg80211_scan: scan error (-110)
[ 7701.918127] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[11406.881341] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[11406.881352] brcmfmac: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -110
[11579.921479] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[11582.491485] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
[11587.611478] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
thenry@pi3portable:~ $
thenry@pi3portable:~ $ uname -a
Linux pi3portable 4.14.27-v7+ #1100 SMP Fri Mar 16 13:51:48 GMT 2018 armv7l GNU/Linux
thenry@pi3portable:~ $
I am running this kernel because I changed to the next stream when I was testing booting from usb, and didn't change back afterwards. Then I got the notice about the new kernel (4.14) so decided to try that, about a month ago. It has been fine, no problems until this one. Only other major change is I switched from NetworkManager to systemd-networkd several days ago but that is after this problem first showed itself.
Regards,
Trevor Henry

Update:
After I read all the related posts I found the latest firmware in the post https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=203508
and this fixed my problem.

test version of brcmfmac43430-sdio.bin installed 250418

version 7.45.98.38 Oct 23 2017, replaced version 7.45.41.46 Aug 7 2017

before:

[ 10.368086] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[ 10.376702] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43430-sdio.bin for chip 0x00a9a6(43430) rev 0x000001
[ 10.377026] usbcore: registered new interface driver brcmfmac
[ 10.599523] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378
[ 10.600577] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 7.11.15 Compiler: 1.24.2 ClmImport: 1.24.1 Creation: 2014-05-26 10:53:55 Inc Data: 9.10.41 Inc Compiler: 1.29.4 Inc ClmImport: 1.36.3 Creation: 2017-08-07 00:37:47
[ 126.642710] brcmfmac: power management disabled
[ 139.249230] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[ 141.751545] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 141.754973] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 144.311482] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 144.314959] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 144.314975] brcmfmac: brcmf_pno_clean: failed code -110
[ 151.831564] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 151.835066] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 151.835079] brcmfmac: brcmf_do_escan: error (-110)
[ 151.835084] brcmfmac: brcmf_cfg80211_scan: scan error (-110)

after:

thenry@pi3portable:~ $ dmesg | grep brcm
[ 10.115833] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[ 10.134926] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43430-sdio.bin for chip 0x00a9a6(43430) rev 0x000001
[ 10.135115] usbcore: registered new interface driver brcmfmac
[ 10.367703] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 23 2017 03:55:53 version 7.45.98.38 (r674442 CY) FWID 01-e58d219f
[ 10.368419] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 7.11.15 Compiler: 1.24.2 ClmImport: 1.24.1 Creation: 2014-05-26 10:53:55 Inc Data: 9.10.39 Inc Compiler: 1.29.4 Inc ClmImport: 1.36.3 Creation: 2017-10-23 03:47:14
[ 18.045308] brcmfmac: power management disabled
thenry@pi3portable:~ $

It has continued to work through several boots and I am now using it, connected by wifi to samsung s4 phone.
Thanks for your help, regards, Trevor Henry.

I thought the latest firmware was already in the latest images, so would have expected that an upgrade to 4.14 would have brought the latest firmware in. Did you build your own kernel?

Yes - the current Raspbian images have firmware 7.45.98.38 from Oct 23 2017.

Hi, no I didn't build the kernel, I upgraded with rpi-update, and as you can see it was still running the Aug 2017 firmware after the update.

rpi-update only upgrades the kernel, firmware and a small number of VideoCore-specific utilities. To upgrade everything, including the WiFi firmware, you must use apt-get upgrade/dist-upgrade.
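
To see which WiFi firmware package apt has actually installed, you can read the Version field from dpkg. A minimal sketch, assuming the package is firmware-brcm80211 as mentioned earlier in this thread (the helper name is made up):

```shell
# Hypothetical helper: reads `dpkg -s <package>` output on stdin and prints
# the installed version string.
pkg_version() {
    sed -n 's/^Version: //p'
}

# Intended use on the Pi (not run here):
#   sudo apt-get update && sudo apt-get dist-upgrade
#   dpkg -s firmware-brcm80211 | pkg_version
```

The firmware the driver actually loaded is still best confirmed from the "Firmware version" line in dmesg after a reboot.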

Hi,
I have this issue too. It is better with the latest FW, 7.45.98.38, than it was, but I still have problems.
Observations:

  • If I boot the Raspberry Pi without doing anything, the WLAN comes up as it should.
  • If I try to use the Bluetooth keyboard or mouse before the WLAN is up, the problem persists: I get no connection.
  • If I have a connection and disable/enable the wireless network, the WLAN does not connect.
  • If I leave the WLAN on overnight, the connection stops working.
  • I have three identical setups and the behaviour is the same on all of them.

I don't know if it matters, but we are using WPA2 Enterprise, PEAP and MSCHAPv2.

Do these issues only happen when the BT devices are connected?

Yes! I disabled Bluetooth and connected a USB keyboard and mouse, and the WLAN connected faster than I have ever seen before.

Still some issues with coexist then. Will need to be flagged up to Cypress I guess.

Just to check, you are using the latest Raspbian? Or something pretty new?

@pelwell ping

Description: Raspbian GNU/Linux 9.4 (stretch)
Do you need any more info?

It hangs after :
May 14 15:43:58 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: CTRL-EVENT-EAP-METHOD EAP vendor 0 method 25 (PEAP) selected

see log snip below

May 14 15:43:58 hwlab1_gul_rpi NetworkManager[2745]: [1526305438.7887] device (wlan0): supplicant interface state: disconnected -> associating
May 14 15:43:58 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: Associated with 44:d9:e7:f7:d5:34
May 14 15:43:58 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: CTRL-EVENT-EAP-STARTED EAP authentication started
May 14 15:43:58 hwlab1_gul_rpi NetworkManager[2745]: [1526305438.9263] device (wlan0): supplicant interface state: associating -> associated
May 14 15:43:58 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: CTRL-EVENT-EAP-PROPOSED-METHOD vendor=0 method=25
May 14 15:43:58 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: CTRL-EVENT-EAP-METHOD EAP vendor 0 method 25 (PEAP) selected
May 14 15:44:24 hwlab1_gul_rpi NetworkManager[2745]: [1526305464.0716] device (wlan0): Activation: (wifi) association took too long
May 14 15:44:24 hwlab1_gul_rpi NetworkManager[2745]: [1526305464.0718] device (wlan0): state change: config -> need-auth (reason 'none') [50 60 0]
May 14 15:44:24 hwlab1_gul_rpi wpa_supplicant[445]: wlan0: CTRL-EVENT-DISCONNECTED bssid=44:d9:e7:f7:d5:34 reason=3 locally_generated=1
May 14 15:44:24 hwlab1_gul_rpi NetworkManager[2745]: [1526305464.0937] device (wlan0): Activation: (wifi) asking for new secrets
May 14 15:44:24 hwlab1_gul_rpi NetworkManager[2745]: [1526305464.0959] sup-iface[0x1c438c0,wlan0]: connection disconnected (reason -3)

I have the same issue with octoPi 0.14 (every package updated, rpi firmware at latest, every octoprint plugin updated).

brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110

With this setup it is 100% reproducible. Accessing the OctoPrint website on the Pi (first access after boot) from my Samsung S4 Active (Android 5.0.1, Chrome) or from my 10-inch Samsung Note tablet (also Android 5.x, I think, with Chrome) kills the WiFi when the page is half loaded.
No cable is connected to my Pi 3; WiFi is on channel 11 with WPA2.
I tried disabling WiFi power management and switching to WiFi channel 6 (tips from above) without any luck, although I had the feeling it was a bit better on channel 6.

But now comes the interesting clue on the bug:
I have no issue when I open the OctoPi/OctoPrint site (on the Pi) from my Windows 10 or Ubuntu 16 machine (using Chrome, with a cable connection to the router). My guess now is that it is an Android-, Samsung- or WiFi-to-WiFi-related bug. I also think I read something about Android/RPi issues a while back.

Hope this helps. If you need a tester for some version I would give it a try.

Just thought I would chime in here and say that we have also seen what look like WiFi-related blocking stalls around this driver on another SBC, which may be related. It's not Raspberry Pi specific.

This is happening to me too.

Setup

  • Pi 3B 1.2 (a02082)
  • Kernel:
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.14.54-v7+ #1126 SMP Wed Jul 11 20:01:03 BST 2018 armv7l GNU/Linux

Running Raspbian version 9.4:

pi@raspberrypi:~ $ cat /etc/debian_version
9.4

Firmware version:

pi@raspberrypi:~ $ /opt/vc/bin/vcgencmd version
Jul  9 2018 19:35:54
Copyright (c) 2012 Broadcom
version daa7178a0900fd9a743c019f9dad7889d531e71d (clean) (release)

wlan0 Power management is turned off:

pi@raspberrypi:~ $ iwconfig wlan0
wlan0     IEEE 802.11  ESSID:"VIRUS_2.4"
          Mode:Managed  Frequency:2.462 GHz  Access Point: D4:7B:B0:79:AF:A6
          Bit Rate=72.2 Mb/s   Tx-Power=31 dBm
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=47/70  Signal level=-63 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:120  Invalid misc:0   Missed beacon:0

I'm using the built-in wifi. Nothing is connected to the ethernet port.

System has been upgraded using apt-get upgrade, apt-get dist-upgrade and rpi-update.

What I see

After the pi has been up for about an hour, it becomes unreachable from the network. I can't reach the Pi from my local network (ping and ssh don't work).

In dmesg, I see that I get:

brcmfmac: power management disabled
...
snd_bcm2835: module is from the staging directory, the quality is unknown, you have been warned

But no errors.

Something interesting

I noticed that when this happens, if I connect to the pi directly and ping my laptop - things get back to work. Also, ping times are a bit odd - seems like it takes a little time to get things to 'warm up':

pi@raspberrypi:~ $ ping 192.168.1.22
PING 192.168.1.22 (192.168.1.22) 56(84) bytes of data.
64 bytes from 192.168.1.22: icmp_seq=1 ttl=64 time=5024 ms
64 bytes from 192.168.1.22: icmp_seq=2 ttl=64 time=4010 ms
64 bytes from 192.168.1.22: icmp_seq=3 ttl=64 time=2971 ms
64 bytes from 192.168.1.22: icmp_seq=4 ttl=64 time=1932 ms
64 bytes from 192.168.1.22: icmp_seq=5 ttl=64 time=892 ms
64 bytes from 192.168.1.22: icmp_seq=6 ttl=64 time=5.63 ms
64 bytes from 192.168.1.22: icmp_seq=7 ttl=64 time=12.4 ms
64 bytes from 192.168.1.22: icmp_seq=8 ttl=64 time=5.59 ms
64 bytes from 192.168.1.22: icmp_seq=9 ttl=64 time=55.5 ms
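
One possible reading of these numbers, sketched below: if ping was sending at its default one-second interval (an assumption, since the command line shows no `-i` option), then adding each request's send offset to its round-trip time gives the approximate moment each reply arrived. The values are copied from the output above; the function name is made up for illustration.

```python
# Round-trip times (ms) copied from the ping output above, icmp_seq 1..9.
rtts_ms = [5024, 4010, 2971, 1932, 892, 5.63, 12.4, 5.59, 55.5]

# Assumption: ping used its default interval of one second between requests.
SEND_INTERVAL_MS = 1000


def reply_arrival_times(rtts, interval_ms=SEND_INTERVAL_MS):
    """Return when each reply arrived, in ms after the first request was sent."""
    return [index * interval_ms + rtt for index, rtt in enumerate(rtts)]


arrivals = reply_arrival_times(rtts_ms)
for seq, arrival in enumerate(arrivals, start=1):
    print(f"icmp_seq={seq} reply arrived at about {arrival:.0f} ms")
```

Under that assumption the first six replies all land within roughly 130 ms of each other, about five seconds after the first request. That pattern would fit a link that was stalled and then released its queued packets at once, rather than one that gradually warms up, though the data here cannot prove it.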

If anyone needs any more information, I'd be happy to supply it.

@bugok, does setting the network interface to promiscuous alleviate the problem for you? (ifconfig wlan0 promisc).

@quozl: It didn't help. After a while, ping started to fail:

$ ping 192.168.1.80
PING 192.168.1.80 (192.168.1.80): 56 data bytes
ping: sendto: No route to host
ping: sendto: Host is down
Request timeout for icmp_seq 0
...

Reporting back: My issue seems to be resolved, and seems to be unrelated to the problem in this thread.

Details here, but the gist is that I set a static IP on the Pi itself (in /etc/dhcpcd.conf). After defining the static IP in the router, removing the static IP config from /etc/dhcpcd.conf and rebooting - things seem to work.

A quick update: this issue ("Unknown mailbox data content" error accompanied by complete wireless lockup) persists, on the latest firmware with all updates installed (dist-upgrade).

Changing a single line in the hostapd.conf file (as per my previous comment) still eliminates the issue for me.

Using Rpi3B with kernel 4.14.52-v7 (raspberrypi-kernel 1.20180703-1) and (firmware-brcm80211 1:20161130-3+rpt4)
I am also still facing the problem where the wlan freezes (about 2 of 90 devices per day have the issue). In some cases the adapter is missing, and in others it is not responding. I am not using the Pi in AP mode, just
I tried to rebind it as in RPi-3B+ but with no success.

I currently have a workaround: when no network connection is detected, the Pi reboots. However, this is not a proper solution; at the very least I am trying to reload the driver instead.
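
A minimal sketch of that kind of watchdog, assuming a policy of "reload the driver first, reboot only if that does not help". The thresholds and function names are hypothetical. The `rmmod`/`modprobe brcmfmac` step is taken from a later comment in this thread that reports it restoring WiFi, and it may not recover every failure mode described here.

```python
import subprocess

# Hypothetical thresholds: consecutive failed checks before each step.
RELOAD_AFTER = 3
REBOOT_AFTER = 6


def next_action(consecutive_failures):
    """Pick the mildest recovery step for the current failure streak."""
    if consecutive_failures >= REBOOT_AFTER:
        return "reboot"
    if consecutive_failures == RELOAD_AFTER:
        return "reload_driver"
    return "wait"


# Commands for each step; they need root. "wait" runs nothing.
COMMANDS = {
    "wait": [],
    "reload_driver": [["rmmod", "brcmfmac"], ["modprobe", "brcmfmac"]],
    "reboot": [["reboot"]],
}


def link_is_up(host):
    """Return True if a single ping to host is answered within two seconds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def recover(consecutive_failures):
    """Run the commands for the chosen step and return the step name."""
    action = next_action(consecutive_failures)
    for command in COMMANDS[action]:
        subprocess.run(command, check=False)
    return action
```

A caller would invoke `link_is_up()` with the router address on a timer, reset the failure counter on success, and pass the counter to `recover()` on failure. Reloading only once per streak leaves the driver time to reassociate before the reboot threshold is reached.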

I was consistently seeing this same problem on a previously working Pi 3. I realised that the only change I had made was to plug in an LCD touch screen, drawing power from the Pi. When I unplugged the touch screen, WiFi worked correctly. So it certainly seems to be power related. This was using the official Raspberry AC adaptor.

That's a very interesting data point. Was it one of our LCD's?

@JamesH65 I've also started to experience WiFi crashes and latency spikes after I installed https://www.waveshare.com/wiki/5inch_HDMI_LCD. I have a 3B+, an RPi Cam v2 and the display, hooked up to a 3 A PSU, and I don't get any power warnings...

Hi guys, any update on this? I was trying to use raspivid on a zero W with a TCP stream and after some minutes my Wi-Fi is gone, I guess it's the same issue.

I haven't had the problem in at least a year or so. I'm beginning to increasingly think that it may be simply related to the USB power source not providing enough amps, but I would welcome proof that this is not the case. As a test, try plugging your USB cable into a higher amp adapter, especially if you can easily reproduce the problem.

I am quite sure it's not directly amp related, because I only use supplies of about 2 A, mostly old Samsung chargers. However, it could be ripple or something else with the power supply or the Pi hardware.

I'm still having the issue, but not nearly as often (maybe every few weeks), and I can no longer reliably induce it by connecting from Samsung Android devices.

I am actually powering the Pi zero w with a 3A usb power supply and a 15cm usb cable used to charge powerbanks (no data lines, just power lines)

If I use the connection normally (like a regular user), it works fine, but if I stream MJPEG at 5 Mbps it crashes after a few minutes. I see some mailbox (or similar) error in journalctl (I can't remember exactly, as I'm away from home for a week): SSH stops, ping fails, Wi-Fi drops, and iwconfig takes a few seconds to show results, which are almost empty.

@vascojdb If you are using the Pi as an access point (AP mode), then this workaround (see the bold text on the bottom) should solve your issue.

Let us know?

No, it's not in AP mode, I am connected to my home 2.4GHz Wi-Fi network

hello,

Since version 9.0.0 of LibreELEC on an RPi 3B+, I have had a WiFi issue that prevents time sync with the NTP server at startup.
After some discussions with some LE team members (see here), the issue has been fixed following this modification.

But it seems that this workaround was reverted and the issue is still present.
Is it possible to fix it again?

No one to answer or escalate this issue?

Same issue. Any news on this?

Apr 29 22:47:04 raspberrypi kernel: [37515.093582] brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted

Apr 29 22:47:06 raspberrypi kernel: [37517.524316] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout

Apr 29 22:47:06 raspberrypi kernel: [37517.524776] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle

Apr 29 22:47:06 raspberrypi kernel: [37517.524792] brcmfmac: brcmf_run_escan: error (-110)

Apr 29 22:47:06 raspberrypi kernel: [37517.524807] brcmfmac: brcmf_cfg80211_scan: scan error (-110)

Tried turning off power management for now. Is this an old bug reintroduced?

https://patchwork.kernel.org/patch/9948825/

Same issue. Any news on this?

This message only says that the firmware of the wifi chip crashed. There is nothing the Linux kernel can do except reset it. A helpful bug report contains the following information:

Which wifi firmware are you using?
How do you operate the wifi (AP, client, ...)?
Can you reproduce this within a defined time?
What other wifi devices are involved?
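
To help tell a firmware crash apart from the other brcmfmac messages quoted in this thread, here is a rough sketch that labels kernel log lines by the message text. The labels and the grouping are my own assumptions for illustration, not categories defined by the driver.

```python
import re

# Hypothetical labels, matched against message text quoted in this thread.
SIGNATURES = [
    ("firmware_crash", re.compile(r"firmware halted|firmware trap in dongle")),
    ("unknown_mailbox", re.compile(r"Unknown mailbox data content")),
    ("command_timeout", re.compile(r"-110\b|resumed on timeout")),
]


def classify(line):
    """Return a label for a brcmfmac log line, or None for unrelated lines."""
    if "brcmf" not in line:
        return None
    for label, pattern in SIGNATURES:
        if pattern.search(line):
            return label
    return "other"


sample = [
    "brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted",
    "brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout",
    "brcmfmac: brcmf_run_escan: error (-110)",
]
for line in sample:
    print(classify(line), "<-", line)
```

Including the label counts alongside the answers to the questions above would make it clearer whether a report shows an actual firmware trap or only command timeouts.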

It's in my last comment, as it was reproducible back then, but it's a hard one to reproduce, and the point at which it crashes changes with software changes.

No solution to speak of, but I have exactly the same problem on a Rpi4 with latest firmware installed. I rolled it back to an SD card image I made a few months ago and the problem is gone. As I'm using hostapd, I believe one of or a combination of these upgrades broke it for me:

$ apt list --upgradeable
Listing... Done
...
hostapd/stable 2:2.7+git20190128+0c1e29f-6+deb10u2 armhf [upgradable from: 2:2.7+git20190128+0c1e29f-6+deb10u1]
firmware-brcm80211/testing 1:20190114-1+rpt6 all [upgradable from: 1:20190114-1+rpt4]
raspberrypi-kernel/testing 1.20200212-1 armhf [upgradable from: 1.20200114-1]
...

I tried turning off power management also (and confirmed it was off with iwconfig), but it had no effect when running hostapd. I will have to forego firmware upgrades until it's fixed, as we are sending out many of these and need stable APs for our customers.

Anyone experiencing firmware traps, timeouts (-110) etc. - please enable some firmware debugging so we can gather some data.

Add brcmfmac.debug=0x100000 to /boot/cmdline.txt, keeping it in a single, long line, then reboot. Running dmesg | grep brcmfmac should result in output like this:

[    7.650239] brcmfmac: CONSOLE: d 0
[    7.650256] brcmfmac: CONSOLE: 000000.063 wl0: Broadcom BCM4345 802.11 Wireless Controller 7.45.202 (r724630 CY)
[    7.650270] brcmfmac: CONSOLE: 000000.064 TCAM: 256 used: 252 exceed:0
[    7.650284] brcmfmac: CONSOLE: 000000.065 reclaim section 1: Returned 122844 bytes to the heap
[    7.650297] brcmfmac: CONSOLE: 000000.065 reclaim section 4: Returned 44 bytes to the heap
[    7.650310] brcmfmac: CONSOLE: 000000.065 sdpcmd_dpc: Enable
...

Then just carry on as normal. When the brcmfmac firmware dies, capture the output of dmesg to a file and attach it (or a link to pastebin, etc.) here.

Since the failure triggers other kernel messages, there is a danger that the useful output is lost before you have a chance to capture it. A way to avoid this is to leave a shell constantly saving the kernel messages to a file:

$ dmesg -w > kernel_log.txt &
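
Because cmdline.txt must stay on a single line, a small helper can make the edit less error-prone. This is only a sketch: the function name is hypothetical, and it works on the file contents as a string rather than touching /boot/cmdline.txt itself.

```python
def add_kernel_parameter(cmdline, parameter):
    """Return cmdline as one line with parameter present exactly once."""
    name = parameter.split("=", 1)[0]
    # split() with no arguments also swallows stray newlines, so the
    # result always fits on a single line.
    words = [word for word in cmdline.split()
             if word.split("=", 1)[0] != name]
    words.append(parameter)
    return " ".join(words) + "\n"


before = "root=/dev/mmcblk0p2 rootwait\n"
after = add_kernel_parameter(before, "brcmfmac.debug=0x100000")
print(after, end="")
```

To apply it, read /boot/cmdline.txt as root, pass the contents through the function, and write the result back after keeping a backup copy. Running it twice leaves the line unchanged, since an existing `brcmfmac.debug=` entry is replaced rather than duplicated.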

Seeing the same issue here. Will try the debug mentioned above.

Running hostapd in AP mode, wireguard, and frr. Also using Sixfab cellular hat.

[46972.803286] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[46975.363309] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[46975.363322] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[47292.885392] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47295.445423] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47295.445436] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[47602.007429] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47604.567452] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47604.567465] brcmfmac: brcmf_cfg80211_get_station: GET STA INFO failed, -110
[47830.248947] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47838.328989] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[47887.049300] brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
[47892.649358] brcmfmac: brcmf_cfg80211_stop_ap: SET SSID error (-110)
[47895.209353] brcmfmac: brcmf_cfg80211_stop_ap: BRCMF_C_DOWN error -110
[47897.769374] brcmfmac: brcmf_cfg80211_stop_ap: setting AP mode failed -110
[47902.889420] brcmfmac: brcmf_cfg80211_stop_ap: BRCMF_C_UP error -110
[47905.449430] brcmfmac: brcmf_set_mpc: fail to set mpc
Linux raspberrypi 4.19.118-v7+ #1311 SMP Mon Apr 27 14:21:24 BST 2020 armv7l GNU/Linux

I am also able to recreate this on the 5.4 branch. FWIW, I can always manually trigger this bug by SCPing a large file (>400MB) to my Pi Zero W.

If it helps, my kernel version is as of this commit - https://github.com/raspberrypi/linux/commit/3c860a6fd128e7cf1c39b3f51258a2a078d1a1a4

# uname -a
Linux pichime-1-c93bb27a 5.4.50 #1 Sun Jul 12 20:53:57 CDT 2020 armv6l GNU/Linux
# dmesg | grep brcmfmac | grep Firmware
[    5.319134] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43430/1 wl0: May  2 2019 02:39:18 version 7.45.98.83 (r714225 CY) FWID 01-e539531f

Crash Log with Debugging:

[  340.321646] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  342.881642] ieee80211 phy1: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  345.441616] ieee80211 phy1: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  348.001649] ieee80211 phy1: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  358.241623] ieee80211 phy1: brcmf_cfg80211_disconnect: error (-110)
[  363.361640] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  371.041641] ieee80211 phy1: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  373.601642] ieee80211 phy1: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  376.161620] ieee80211 phy1: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  376.170775] ieee80211 phy1: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -110
[  383.841632] ieee80211 phy1: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  383.851056] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save disabled
[  386.401643] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  388.961642] ieee80211 phy1: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  391.521632] ieee80211 phy1: brcmf_cfg80211_set_power_mgmt: error (-110)
[  394.081651] ieee80211 phy1: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  409.521619] ieee80211 phy1: brcmf_run_escan: error (-110)
[  409.527146] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
[  412.081641] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  414.641643] ieee80211 phy1: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  417.201652] ieee80211 phy1: brcmf_run_escan: error (-110)
[  417.207175] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
[  419.761655] ieee80211 phy1: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  424.881645] ieee80211 phy1: brcmf_run_escan: error (-110)
[  424.887168] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
[  430.001645] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  432.561651] ieee80211 phy1: brcmf_run_escan: error (-110)
[  432.567172] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
[  435.121637] ieee80211 phy1: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  437.681648] ieee80211 phy1: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  440.241651] ieee80211 phy1: brcmf_run_escan: error (-110)
[  440.247173] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
[  447.921623] ieee80211 phy1: brcmf_run_escan: error (-110)
[  447.927145] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)

During the above crash I ran an ifdown and ifup which didn't restore wifi. Only solution is to either reboot the pi, or rmmod & modprobe brcmfmac.
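
For anyone comparing logs like the one above, here is a rough sketch that counts which driver functions failed and with which error code. The regular expression is my own guess at the message layout seen in this thread, and the errno names are the standard Linux values for the codes that appear here (-110, -52 and -2).

```python
import re
from collections import Counter

# Linux errno values for the codes that appear in this thread.
LINUX_ERRNO = {2: "ENOENT", 52: "EBADE", 110: "ETIMEDOUT"}

# Assumed layout: "<function>: <message ending in -NNN or (-NNN)>".
ERROR_RE = re.compile(r"(_?brcmf_\w+): .*?-(\d+)\)?\s*$")


def summarize_errors(log_text):
    """Count (function, errno name) pairs found in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            function, code = match.group(1), int(match.group(2))
            counts[(function, LINUX_ERRNO.get(code, str(code)))] += 1
    return counts


sample = """\
[  340.321646] ieee80211 phy1: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  345.441616] ieee80211 phy1: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  409.521619] ieee80211 phy1: brcmf_run_escan: error (-110)
[  409.527146] ieee80211 phy1: brcmf_cfg80211_scan: scan error (-110)
"""
for (function, name), count in summarize_errors(sample).items():
    print(f"{count:3d}  {name:10s}  {function}")
```

Every failure in this crash log carries -110 (ETIMEDOUT), which would be consistent with the driver timing out while waiting for the firmware to answer each command.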

It's worth noting this happens with power management turned off, since I have this in my interfaces file:

pre-up iwconfig wlan0 power off

That isn't the most recent firmware for the 43438 - we're now on:

Version: 7.45.98.94 (r723000 CY) CRC: ba33fa65 Date: Tue 2019-10-22 02:01:06 PDT Ucode Ver: 1043.2137 FWID 01-3b33decd

Try updating your firmware-brcm80211 package, or downloading the firmware from: https://github.com/RPi-Distro/firmware-nonfree/
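
As a quick way to check which firmware is actually loaded, here is a sketch that pulls the chip name and version out of the `Firmware:` line that `dmesg | grep brcmfmac` prints and compares it with the 7.45.98.94 release mentioned above. The regular expression is an assumption based on the two firmware lines quoted in this thread.

```python
import re

# Assumed layout: "Firmware: <chip> ... version <a.b.c.d> (...)".
FIRMWARE_RE = re.compile(r"Firmware: (\S+) .*? version (\d+(?:\.\d+)+) ")


def parse_firmware(line):
    """Return (chip, version tuple) from a brcmfmac Firmware line, or None."""
    match = FIRMWARE_RE.search(line)
    if match is None:
        return None
    chip, version = match.groups()
    return chip, tuple(int(part) for part in version.split("."))


line = ("[    5.319134] brcmfmac: brcmf_c_preinit_dcmds: Firmware: "
        "BCM43430/1 wl0: May  2 2019 02:39:18 version 7.45.98.83 "
        "(r714225 CY) FWID 01-e539531f")
chip, version = parse_firmware(line)
newer_available = version < (7, 45, 98, 94)
print(chip, version, "older than 7.45.98.94:", newer_available)
```

Comparing the versions as integer tuples avoids the string-comparison trap where "7.45.98.9" would sort after "7.45.98.83".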

If you still see errors, enable brcmfmac firmware logging by adding brcmfmac.debug=0x100000 to cmdline.txt.

@pelwell Sorry about that, but I updated and can still recreate the issue using the method I mentioned.

Note I enabled debugging as requested, but this is all I got:

[    0.000000] Kernel command line: root=/dev/mmcblk0p2 8250.nr_uarts=1 console=ttyS0,115200 rootwait earlyprintk brcmfmac.debug=0x100000
[    4.940560] brcmfmac: F1 signature read @0x18000000=0x1541a9a6
[    4.958767] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43430-sdio for chip BCM43430/1
[    4.973290] usbcore: registered new interface driver brcmfmac
[    5.324551] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43430-sdio for chip BCM43430/1
[    5.334223] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
[    5.347276] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43430/1 wl0: Oct 22 2019 01:59:28 version 7.45.98.94 (r723000 CY) FWID 01-3b33decd
[    5.443617] brcmfmac: CONSOLE: hndarm_armr addr: 0x18003000, cr4_idx: 0
[    5.443635] brcmfmac: CONSOLE: 000000.001
[    5.443646] brcmfmac: CONSOLE: RTE (SDIO-CDC) 7.45.98.94 (r723000 CY) on BCM43430 r1 @ 37.4/81.6/81.6MHz
[    5.443655] brcmfmac: CONSOLE: 000000.003 sdpcmdcdc0: Broadcom SDPCMD CDC driver
[    5.443665] brcmfmac: CONSOLE: 000000.008 reclaim section 0: Returned 46092 bytes to the heap
[    5.443673] brcmfmac: CONSOLE: 000000.012 wlc_bmac_info_init: host_enab 1
[    5.443684] brcmfmac: CONSOLE: 000000.064 wl0: Broadcom BCM43430 802.11 Wireless Controller 7.45.98.94 (r723000 CY)
[    5.443693] brcmfmac: CONSOLE: 000000.067 TCAM: 256 used: 212 exceed:0
[    5.443702] brcmfmac: CONSOLE: 000000.069 reclaim section 1: Returned 81228 bytes to the heap
[   51.183451] brcmfmac: CONSOLE: 000045.943 wl0: wl_open
[   51.213694] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save disabled
[  260.001321] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  262.561331] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  265.121296] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  267.681321] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  275.361321] ieee80211 phy0: brcmf_cfg80211_disconnect: error (-110)
[  280.481324] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  285.601297] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  285.610456] ieee80211 phy0: brcmf_cfg80211_reg_notifier: Country code iovar returned err = -110
[  288.161325] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  290.721325] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  293.281314] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[  293.291034] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save disabled
[  300.961315] ieee80211 phy0: brcmf_cfg80211_set_power_mgmt: error (-110)
[  306.081321] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  308.641320] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  313.761330] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  324.001323] ieee80211 phy0: brcmf_run_escan: error (-110)
[  324.006845] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[  326.561329] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  329.121322] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  331.681324] ieee80211 phy0: brcmf_run_escan: error (-110)
[  331.686848] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[  334.241329] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  339.361315] ieee80211 phy0: brcmf_run_escan: error (-110)
[  339.366836] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[  344.481323] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  347.041339] ieee80211 phy0: brcmf_run_escan: error (-110)
[  347.046862] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[  349.601345] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  352.161310] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  354.721371] ieee80211 phy0: brcmf_run_escan: error (-110)
[  354.726896] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[  362.401325] ieee80211 phy0: brcmf_run_escan: error (-110)
[  362.406850] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)

I was able to get more of a log by doing an ifdown & ifup on wlan0, hopefully this helps somewhat:

[ 1420.259650] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[ 1423.774141] ieee80211 phy0: brcmf_run_escan: error (-110)
[ 1423.779662] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[ 1427.294190] ieee80211 phy0: brcmf_run_escan: error (-110)
[ 1427.299710] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[ 1430.814146] ieee80211 phy0: brcmf_run_escan: error (-110)
[ 1430.819668] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-110)
[ 1444.148281] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 1445.157155] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 1446.166847] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 1447.176537] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 1448.185305] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)

...
ifdown and ifup
...

[ 2984.008316] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 2984.019327] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 2984.024840] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 3005.603730] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 3005.609162] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 3005.620132] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 3005.625685] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 3349.033428] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 3349.040692] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 3349.324019] ------------[ cut here ]------------
[ 3349.330137] WARNING: CPU: 0 PID: 262 at net/wireless/sme.c:756 __cfg80211_connect_result+0x41c/0x4d0 [cfg80211]
[ 3349.340546] Modules linked in: ipv6 nf_defrag_ipv6 brcmfmac brcmutil sha256_generic libsha256 cfg80211 rfkill snd_soc_simple_card snd_soc_simple_card_utils snd_soc_max98357a snd_soc_bcm2835_i2s regmap_mmio snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd
[ 3349.365074] CPU: 0 PID: 262 Comm: kworker/u2:2 Not tainted 5.4.51 #1
[ 3349.371533] Hardware name: BCM2835
[ 3349.376401] Workqueue: cfg80211 cfg80211_event_work [cfg80211]
[ 3349.382516] Backtrace:
[ 3349.385049] [<c00156e8>] (dump_backtrace) from [<c0015a34>] (show_stack+0x20/0x24)
[ 3349.392805]  r7:000002f4 r6:bf10d624 r5:00000009 r4:bf135900
[ 3349.398587] [<c0015a14>] (show_stack) from [<c0736d54>] (dump_stack+0x20/0x28)
[ 3349.406004] [<c0736d34>] (dump_stack) from [<c00239a4>] (__warn+0xd0/0x104)
[ 3349.413150] [<c00238d4>] (__warn) from [<c0023d58>] (warn_slowpath_fmt+0x6c/0xc4)
[ 3349.420765]  r7:bf10d624 r6:000002f4 r5:bf135900 r4:00000000
[ 3349.427938] [<c0023cf0>] (warn_slowpath_fmt) from [<bf10d624>] (__cfg80211_connect_result+0x41c/0x4d0 [cfg80211])
[ 3349.438495]  r8:d8dd6084 r7:d94ebe64 r6:00000000 r5:d8dd6004 r4:d8f2da0c
[ 3349.448017] [<bf10d208>] (__cfg80211_connect_result [cfg80211]) from [<bf0dda00>] (cfg80211_process_wdev_events+0x138/0x1c0 [cfg80211])
[ 3349.460512]  r7:d8dd6024 r6:d8dd6004 r5:80000013 r4:d8f2da00
[ 3349.469003] [<bf0dd8c8>] (cfg80211_process_wdev_events [cfg80211]) from [<bf0ddac8>] (cfg80211_process_rdev_events+0x40/0x98 [cfg80211])
[ 3349.481589]  r10:d88bc0d8 r9:00000000 r8:00000000 r7:d948ae00 r6:00000040 r5:d88bc420
[ 3349.489599]  r4:d8dd6004
[ 3349.494901] [<bf0dda88>] (cfg80211_process_rdev_events [cfg80211]) from [<bf0d71c4>] (cfg80211_event_work+0x24/0x2c [cfg80211])
[ 3349.506686]  r5:c772d600 r4:d88bc0d4
[ 3349.511718] [<bf0d71a0>] (cfg80211_event_work [cfg80211]) from [<c003ddd4>] (process_one_work+0x1c8/0x470)
[ 3349.521648]  r5:c772d600 r4:d88bc0d4
[ 3349.525355] [<c003dc0c>] (process_one_work) from [<c003e0c4>] (worker_thread+0x48/0x52c)
[ 3349.533641]  r10:d940d200 r9:00000088 r8:c0a3c760 r7:d940d214 r6:c772d614 r5:d940d200
[ 3349.541603]  r4:c772d600
[ 3349.544279] [<c003e07c>] (worker_thread) from [<c00434cc>] (kthread+0x120/0x15c)
[ 3349.551812]  r10:d0067e88 r9:d8ef3f98 r8:c772d600 r7:d94ea000 r6:00000000 r5:c502c460
[ 3349.559821]  r4:d8ef3f80
[ 3349.562456] [<c00433ac>] (kthread) from [<c00090ac>] (ret_from_fork+0x14/0x28)
[ 3349.569801] Exception stack(0xd94ebfb0 to 0xd94ebff8)
[ 3349.574990] bfa0:                                     00000000 00000000 00000000 00000000
[ 3349.583349] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3349.591665] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 3349.598439]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c00433ac
[ 3349.606436]  r4:c502c460
[ 3349.609020] ---[ end trace 53428b45b18f1d66 ]---
[ 3726.022943] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 3726.030239] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 3726.314103] ------------[ cut here ]------------
[ 3726.320236] WARNING: CPU: 0 PID: 262 at net/wireless/sme.c:756 __cfg80211_connect_result+0x41c/0x4d0 [cfg80211]
[ 3726.330648] Modules linked in: ipv6 nf_defrag_ipv6 brcmfmac brcmutil sha256_generic libsha256 cfg80211 rfkill snd_soc_simple_card snd_soc_simple_card_utils snd_soc_max98357a snd_soc_bcm2835_i2s regmap_mmio snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd
[ 3726.355180] CPU: 0 PID: 262 Comm: kworker/u2:2 Tainted: G        W         5.4.51 #1
[ 3726.363093] Hardware name: BCM2835
[ 3726.367928] Workqueue: cfg80211 cfg80211_event_work [cfg80211]
[ 3726.373983] Backtrace:
[ 3726.376518] [<c00156e8>] (dump_backtrace) from [<c0015a34>] (show_stack+0x20/0x24)
[ 3726.384275]  r7:000002f4 r6:bf10d624 r5:00000009 r4:bf135900
[ 3726.390113] [<c0015a14>] (show_stack) from [<c0736d54>] (dump_stack+0x20/0x28)
[ 3726.397538] [<c0736d34>] (dump_stack) from [<c00239a4>] (__warn+0xd0/0x104)
[ 3726.404673] [<c00238d4>] (__warn) from [<c0023d58>] (warn_slowpath_fmt+0x6c/0xc4)
[ 3726.412331]  r7:bf10d624 r6:000002f4 r5:bf135900 r4:00000000
[ 3726.419466] [<c0023cf0>] (warn_slowpath_fmt) from [<bf10d624>] (__cfg80211_connect_result+0x41c/0x4d0 [cfg80211])
[ 3726.430020]  r8:d8dd6084 r7:d94ebe64 r6:00000000 r5:d8dd6004 r4:c5343a0c
[ 3726.439551] [<bf10d208>] (__cfg80211_connect_result [cfg80211]) from [<bf0dda00>] (cfg80211_process_wdev_events+0x138/0x1c0 [cfg80211])
[ 3726.452052]  r7:d8dd6024 r6:d8dd6004 r5:80000013 r4:c5343a00
[ 3726.460498] [<bf0dd8c8>] (cfg80211_process_wdev_events [cfg80211]) from [<bf0ddac8>] (cfg80211_process_rdev_events+0x40/0x98 [cfg80211])
[ 3726.473127]  r10:d88bc0d8 r9:00000000 r8:00000000 r7:d948ae00 r6:00000040 r5:d88bc420
[ 3726.481129]  r4:d8dd6004
[ 3726.486396] [<bf0dda88>] (cfg80211_process_rdev_events [cfg80211]) from [<bf0d71c4>] (cfg80211_event_work+0x24/0x2c [cfg80211])
[ 3726.498184]  r5:c772d600 r4:d88bc0d4
[ 3726.503264] [<bf0d71a0>] (cfg80211_event_work [cfg80211]) from [<c003ddd4>] (process_one_work+0x1c8/0x470)
[ 3726.513197]  r5:c772d600 r4:d88bc0d4
[ 3726.516863] [<c003dc0c>] (process_one_work) from [<c003e0c4>] (worker_thread+0x48/0x52c)
[ 3726.525151]  r10:d940d200 r9:00000088 r8:c0a3c760 r7:d940d214 r6:c772d614 r5:d940d200
[ 3726.533151]  r4:c772d600
[ 3726.535756] [<c003e07c>] (worker_thread) from [<c00434cc>] (kthread+0x120/0x15c)
[ 3726.543328]  r10:d0067e88 r9:d8ef3f98 r8:c772d600 r7:d94ea000 r6:00000000 r5:c502c460
[ 3726.551332]  r4:d8ef3f80
[ 3726.553933] [<c00433ac>] (kthread) from [<c00090ac>] (ret_from_fork+0x14/0x28)
[ 3726.561319] Exception stack(0xd94ebfb0 to 0xd94ebff8)
[ 3726.566462] bfa0:                                     00000000 00000000 00000000 00000000
[ 3726.574824] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3726.583181] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 3726.589916]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c00433ac
[ 3726.597913]  r4:c502c460
[ 3726.600531] ---[ end trace 53428b45b18f1d67 ]---
[ 4075.415726] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 4075.423088] ieee80211 phy0: brcmf_cfg80211_scan: Connecting: status (3)
[ 4075.707740] ------------[ cut here ]------------
[ 4075.713868] WARNING: CPU: 0 PID: 297 at net/wireless/sme.c:756 __cfg80211_connect_result+0x41c/0x4d0 [cfg80211]
[ 4075.724269] Modules linked in: ipv6 nf_defrag_ipv6 brcmfmac brcmutil sha256_generic libsha256 cfg80211 rfkill snd_soc_simple_card snd_soc_simple_card_utils snd_soc_max98357a snd_soc_bcm2835_i2s regmap_mmio snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd
[ 4075.748795] CPU: 0 PID: 297 Comm: kworker/u2:1 Tainted: G        W         5.4.51 #1
[ 4075.756666] Hardware name: BCM2835
[ 4075.761541] Workqueue: cfg80211 cfg80211_event_work [cfg80211]
[ 4075.767595] Backtrace:
[ 4075.770129] [<c00156e8>] (dump_backtrace) from [<c0015a34>] (show_stack+0x20/0x24)
[ 4075.777886]  r7:000002f4 r6:bf10d624 r5:00000009 r4:bf135900
[ 4075.783669] [<c0015a14>] (show_stack) from [<c0736d54>] (dump_stack+0x20/0x28)
[ 4075.791085] [<c0736d34>] (dump_stack) from [<c00239a4>] (__warn+0xd0/0x104)
[ 4075.798226] [<c00238d4>] (__warn) from [<c0023d58>] (warn_slowpath_fmt+0x6c/0xc4)
[ 4075.805843]  r7:bf10d624 r6:000002f4 r5:bf135900 r4:00000000
[ 4075.813019] [<c0023cf0>] (warn_slowpath_fmt) from [<bf10d624>] (__cfg80211_connect_result+0x41c/0x4d0 [cfg80211])
[ 4075.823577]  r8:d8dd6084 r7:d8ea1e64 r6:00000000 r5:d8dd6004 r4:d8ea660c
[ 4075.833125] [<bf10d208>] (__cfg80211_connect_result [cfg80211]) from [<bf0dda00>] (cfg80211_process_wdev_events+0x138/0x1c0 [cfg80211])
[ 4075.845621]  r7:d8dd6024 r6:d8dd6004 r5:80000013 r4:d8ea6600
[ 4075.854111] [<bf0dd8c8>] (cfg80211_process_wdev_events [cfg80211]) from [<bf0ddac8>] (cfg80211_process_rdev_events+0x40/0x98 [cfg80211])
[ 4075.866698]  r10:d88bc0d8 r9:00000000 r8:00000000 r7:d948ae00 r6:00000040 r5:d88bc420
[ 4075.874702]  r4:d8dd6004
[ 4075.879969] [<bf0dda88>] (cfg80211_process_rdev_events [cfg80211]) from [<bf0d71c4>] (cfg80211_event_work+0x24/0x2c [cfg80211])
[ 4075.891798]  r5:c772d5a0 r4:d88bc0d4
[ 4075.896834] [<bf0d71a0>] (cfg80211_event_work [cfg80211]) from [<c003ddd4>] (process_one_work+0x1c8/0x470)
[ 4075.906765]  r5:c772d5a0 r4:d88bc0d4
[ 4075.910472] [<c003dc0c>] (process_one_work) from [<c003e0c4>] (worker_thread+0x48/0x52c)
[ 4075.918757]  r10:d940d200 r9:00000088 r8:c0a3c760 r7:d940d214 r6:c772d5b4 r5:d940d200
[ 4075.926717]  r4:c772d5a0
[ 4075.929359] [<c003e07c>] (worker_thread) from [<c00434cc>] (kthread+0x120/0x15c)
[ 4075.936891]  r10:d0067e88 r9:d8fa4b98 r8:c772d5a0 r7:d8ea0000 r6:00000000 r5:c502c6c0
[ 4075.944892]  r4:d8fa4b80
[ 4075.947525] [<c00433ac>] (kthread) from [<c00090ac>] (ret_from_fork+0x14/0x28)
[ 4075.954872] Exception stack(0xd8ea1fb0 to 0xd8ea1ff8)
[ 4075.960063] 1fa0:                                     00000000 00000000 00000000 00000000
[ 4075.968425] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 4075.976743] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 4075.983516]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c00433ac
[ 4075.991514]  r4:c502c6c0
[ 4075.994097] ---[ end trace 53428b45b18f1d68 ]---

I'm seeing the same issue with my Raspberry PI Zero W.

Linux luca1 5.4.51+ #1327 Thu Jul 23 10:53:06 BST 2020 armv6l GNU/Linux
brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43430/1 wl0: Oct 22 2019 01:59:28 version 7.45.98.94 (r723000 CY) FWID 01-3b33decd

I decided to do more debugging myself using modprobe brcmfmac debug=0x14dd36 and it seems I was able to capture the moment wifi stopped working. https://gist.github.com/riptidewave93/787ccd6ef50a7bb0f804d330d0dff33c

Note this was on Linux embedded 5.7.9 #1 Sat Aug 8 13:21:12 CDT 2020 armv6l GNU/Linux which is based off of the rpi 5.7 branch as of commit https://github.com/raspberrypi/linux/commit/95e191414d6915bd44d794e679d8400811ee5a5f

From the gist, you can see that WiFi started failing around 330.527497, when brcmf_sdio_bus_watchdog is first referenced. After that, txdata slows to almost nothing and brcmf_sdio_bus_watchdog is called over and over. Digging into the code, this function is called by https://github.com/raspberrypi/linux/blob/rpi-5.7.y/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c#L4045-L4069 It's also worth noting that this watchdog code, according to git blame, was last changed about 6 years ago.
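
For anyone who wants to look for the same pattern in their own debug log, here is a minimal Python sketch. It assumes the log uses the bracketed dmesg timestamp format seen in the traces above; the helper name first_and_count is made up for illustration and is not part of any tool.

```python
import re

# Matches the "[ 3726.314103] message" prefix used by dmesg.
TS_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def first_and_count(log_text, needle):
    """Return (first_timestamp, count) for timestamped lines containing needle.

    Lines without a dmesg-style timestamp are ignored.
    """
    first = None
    count = 0
    for line in log_text.splitlines():
        m = TS_LINE.match(line)
        if not m or needle not in m.group(2):
            continue
        count += 1
        if first is None:
            first = float(m.group(1))
    return first, count
```

Feeding it the saved output of dmesg with the needle "brcmf_sdio_bus_watchdog" should give the time of the first watchdog reference and how often it repeats, which makes it easier to compare runs.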

This makes me think there may be an issue with the SDIO bus, but I am personally not skilled enough to dig much deeper than this. Could this maybe be a clock issue?

@pelwell Would love your thoughts on this one :sweat_smile:

This is not a long-term solution, but it is worth mentioning for anyone looking for a workaround:

If you have already upgraded your WiFi firmware, try:
pi@raspberrypi:~ $ wget http://archive.raspberrypi.org/debian/pool/main/f/firmware-nonfree/firmware-brcm80211_20190114-1+rpt4_all.deb
pi@raspberrypi:~ $ sudo dpkg -i firmware-brcm80211_20190114-1+rpt4_all.deb
pi@raspberrypi:~ $ sudo reboot

If you haven't upgraded your firmware, but would like to continue with the latest OS updates:
pi@raspberrypi:~ $ sudo apt update
pi@raspberrypi:~ $ sudo apt list --upgradeable | grep firmware-brcm80211

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

firmware-brcm80211/testing 1:20190114-1+rpt7 all [upgradable from: 1:20190114-1+rpt4]

Note that the output above shows the firmware version that would otherwise be installed. Then hold the package:
pi@raspberrypi:~ $ sudo apt-mark hold firmware-brcm80211

And check that it was successful:
pi@raspberrypi:~ $ apt-mark showhold
firmware-brcm80211

Now it is safe to perform a full upgrade leaving the WiFi function intact:
pi@raspberrypi:~ $ sudo apt -y upgrade

If at any time you need to remove the hold on the package (to do more testing, etc.):
pi@raspberrypi:~ $ sudo apt-mark unhold firmware-brcm80211

I can confirm through quite extensive testing that package version 20190114-1+rpt4 seems very stable with hostapd and other functions. For now it seems to be working fine with the latest kernel.
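
If you want to verify the hold from a script rather than by eye, something like the following Python sketch could work. It only relies on apt-mark showhold printing one package name per line and on dpkg-query -W -f='${Version}'; the function names are hypothetical, and the version format is assumed to match the "1:20190114-1+rpt4" strings shown above.

```python
import re
import subprocess

PACKAGE = "firmware-brcm80211"


def is_held(showhold_output, package=PACKAGE):
    """True if package appears as a whole line in `apt-mark showhold` output."""
    return package in (line.strip() for line in showhold_output.splitlines())


def rpt_revision(version):
    """Extract N from a version such as '1:20190114-1+rpt4'; None if absent."""
    m = re.search(r"\+rpt(\d+)$", version.strip())
    return int(m.group(1)) if m else None


def installed_version(package=PACKAGE):
    """Ask dpkg-query for the installed version (Debian-based systems only)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

On the Pi, rpt_revision(installed_version()) should report 4 while the hold is in place, and is_held() can be given the captured output of apt-mark showhold.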

Per @jeremyn54, this seems to have helped me. I've been running this for a few days now and so far no drops. I ended up upgrading the firmware as well as the kernel.

root@raspberrypi:~# dpkg -l |grep firmware-brcm80211
ii  firmware-brcm80211                    1:20190114-1+rpt7                      all          Binary firmware for Broadcom/Cypress 802.11 wireless cards
Linux raspberrypi 5.4.51-v7+ #1327 SMP Thu Jul 23 10:58:46 BST 2020 armv7l GNU/Linux
ii  raspberrypi-kernel                    1.20200723-1                           armhf        Raspberry Pi bootloader

Hopefully this helps others. I'll post back if I get any lockups/drops. I'm running it in AP mode.

Based on what was shared by @jeremyn54 and @robgil, I extracted the firmware blobs from both of the mentioned raspbian packages:

7.45.98.38 - 20190114-1+rpt4
7.45.98.94 - 20190114-1+rpt7

And on my kernel, Linux buildroot 5.7.9 #1 Mon Aug 10 19:06:58 CDT 2020 armv6l GNU/Linux, I am still seeing the WiFi crash when SCPing large files to the Pi Zero as mentioned earlier.
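
To tell which of the two blobs a given Pi is actually running, the brcmfmac "Firmware:" line in dmesg can be parsed. This is a rough Python sketch, assuming the line has the same layout as the one quoted earlier in this thread; the version-to-package table only contains the two pairs listed above, and firmware_info is a name invented for this example.

```python
import re

# Firmware version -> Raspbian package revision, as listed in this thread.
KNOWN_PACKAGES = {
    "7.45.98.38": "20190114-1+rpt4",
    "7.45.98.94": "20190114-1+rpt7",
}

# Captures the chip name and the dotted version from the brcmfmac log line.
FW_RE = re.compile(r"Firmware: (\S+) .*? version (\d+(?:\.\d+)+)")


def firmware_info(dmesg_line):
    """Return (chip, version, package_or_None), or None if the line does not match."""
    m = FW_RE.search(dmesg_line)
    if not m:
        return None
    chip, version = m.group(1), m.group(2)
    return chip, version, KNOWN_PACKAGES.get(version)
```

A version that is not in the table simply comes back with None as the package, so unknown firmware builds are not misreported.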

There is a promising feature "reset SDIO bus on a firmware crash" in the upcoming Linux 5.9.

Sadly, I cherry-picked this and tested it, along with a handful of other upcoming patches for 5.9, with no success. From my testing, the issue doesn't seem to be a firmware crash but something actually wrong with the SDIO bus. I really wish this issue would get more eyes from Raspberry Pi.

As an update to the issue, it seems the cause of the crash, at least in my case, is due to my Pi Zero being connected to a network that has 802.11r fast roaming enabled. If I reconnect to a non 802.11r network, I do not have connectivity issues. I have tested with roamoff=1 as well as roamoff=0, and I can always re-create the driver issue during an inbound SCP to the device. Since roamoff has no impact on the issue, this leads me to think the issue is within the brcmfmac driver on handling 802.11r networks.
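
For anyone who wants to check whether a nearby network advertises 802.11r before changing AP settings, here is a hedged Python sketch that scans saved `iw dev wlan0 scan` output for FT authentication suites. It assumes iw prints an "Authentication suites:" line containing entries such as "FT/PSK" after the "SSID:" line of each BSS; the exact output can differ between iw versions, and ft_networks is a made-up helper name.

```python
def ft_networks(scan_text):
    """Return SSIDs whose scan entry lists an FT/* authentication suite."""
    found = []
    ssid = None
    for raw in scan_text.splitlines():
        line = raw.strip()
        if line.startswith("BSS "):
            # A new BSS entry starts; forget the previous SSID.
            ssid = None
        elif line.startswith("SSID:"):
            ssid = line[len("SSID:"):].strip()
        elif "Authentication suites:" in line and "FT/" in line:
            if ssid is not None and ssid not in found:
                found.append(ssid)
    return found
```

If the SSID the Pi connects to shows up in the result, that would be consistent with the 802.11r theory above, though it does not prove the driver is at fault.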

I can confirm that disabling fast roaming in my AP worked around the problem. I haven't seen connectivity drop ever since.

@jaroslawprzybylowicz I am trying to get more information on what may be causing the issue. Mind if I ask what type of AP you are using, and what type of radios it has?

I am personally using a few Ubiquiti Unifi NanoHDs, which use the MediaTek MT7603 for the B/G/N radio.

I was using an AVM FRITZ!Box 7412 with the original firmware. For hardware details, see the OpenWrt page for the device. As mentioned earlier, I had the impression that it happens mostly when an Android device (v4/5/6, maybe newer ones too) accessed an OctoPrint website on the Pi. It also happened randomly over time.
Some more details are in my original comment. It may depend on the end device or the network traffic, but I guess not on the AP or radio.

@riptidewave93 My setup is single UniFi AP-AC-Pro with Qualcomm Atheros QCA9563 onboard. It has both 2.4 and 5GHz radios enabled under the same SSID.

For what it's worth, I'm using a TP-Link AC-1750, which has 2.4 GHz and 5 GHz on different SSIDs. I have also only observed the issue when connecting from an Android device.

On my Pi 3B, the WiFi doesn't die after a while; it doesn't even start up anymore. Here is the output with the debug flag enabled: https://gist.github.com/pentlander/d37fa273f955ac988f71342c47109d28
