Add a second recovery path if BLE stack stops being able to receive adverts #20365

Closed
wants to merge 1 commit into from

lis0r
Contributor

@lis0r lis0r commented Jan 1, 2024

Add a second recovery path if BLE stack stops being able to receive adverts

Description:

I have twice seen the reception of passive adverts stall, requiring a device restart before it resumes. The issue is highly intermittent, and I have not found a way to reproduce it at will. I suspect it is related to espressif/esp-idf#4001.

The logs contain lines stating "BLE: scan stall? no adverts > 120s, restart BLE". The NimBLE restart itself succeeds, so Tasmota does not restart. However, restarting NimBLE is not sufficient to rectify the underlying problem, so the message repeats every 2 minutes.

I have adjusted the check so that it will try to restart NimBLE after 120s without advertisements. If there are still no advertisements after 240s, it will restart Tasmota.
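The two-stage escalation described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from this PR: `StallWatchdog`, `StallAction`, `onAdvert` and `poll` are hypothetical names, and the only values taken from the description are the 120s and 240s thresholds.

```cpp
#include <cstdint>

// Hypothetical sketch; none of these names exist in the Tasmota BLE driver.
enum class StallAction { None, RestartBleStack, RestartDevice };

struct StallWatchdog {
    // Thresholds from the PR description.
    static constexpr uint32_t kStackRestartAfterS = 120;
    static constexpr uint32_t kDeviceRestartAfterS = 240;

    uint32_t lastAdvertS = 0;        // time the last advert was received
    bool stackRestartTried = false;  // true once stage one has fired

    // Call whenever an advert is received: clears any pending escalation.
    void onAdvert(uint32_t nowS) {
        lastAdvertS = nowS;
        stackRestartTried = false;
    }

    // Call periodically (e.g. once per second) to decide what to do.
    StallAction poll(uint32_t nowS) {
        const uint32_t silentS = nowS - lastAdvertS;
        // Stage two: the stack restart did not bring adverts back.
        if (stackRestartTried && silentS > kDeviceRestartAfterS) {
            return StallAction::RestartDevice;
        }
        // Stage one: first sign of a stall, try the cheaper recovery once.
        if (!stackRestartTried && silentS > kStackRestartAfterS) {
            stackRestartTried = true;
            return StallAction::RestartBleStack;
        }
        return StallAction::None;
    }
};
```

In this sketch the stack restart fires once per silent period, so the "restart BLE" message would no longer repeat every 2 minutes; a received advert resets both stages.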

Checklist:

  • The pull request is done against the latest development branch
  • Only relevant files were touched
  • Only one feature/fix was added per PR and the code change compiles without warnings
  • The code change is tested and works with Tasmota core ESP8266 V.2.7.4.9
  • The code change is tested and works with Tasmota core ESP32 V.2.0.14
  • I accept the CLA.

NOTE: The code change must pass CI tests. Your PR cannot be merged unless tests pass

@Jason2866
Collaborator

Jason2866 commented Jan 1, 2024

There will be no merge. No driver is allowed to restart Tasmota just because that specific driver fails, for whatever reason. A workaround has to be found or, better, a fix in the faulty code.

It looks like Espressif has not managed to solve this: h2zero/NimBLE-Arduino#460 (comment)
Maybe there will be more luck with the upcoming Arduino core 3.0.0, which is based on IDF 5.1.

@s-hadinger
Collaborator

There is one exception though, with Zigbee. If the MCU goes into an invalid state, there is no choice other than a full restart of both Tasmota and the MCU.

I'm not familiar with BLE. What does "no adverts" mean? Can it happen if there are no BLE devices in range?

@barbudor
Contributor

barbudor commented Jan 1, 2024

I assume this also occurs only when using active scan?
I'm all passive with PVVX'ed Mija and I haven't seen this blocking

This can probably also be done with a rule.

@lis0r
Contributor Author

lis0r commented Jan 1, 2024

It is definitely not a case of no devices in range: my house is a BLE-heavy environment. The node in question normally reports 13 devices, and there are more devices just slightly out of range. From what I can see, it finds zero devices because the BLE controller has hit an error so severe that even restarting the BLE stack is insufficient to fix it. Manually restarting Tasmota quickly and definitively fixes the BLE stack.

I have active scans disabled to save power, and I operate purely on receiving passive adverts from a number of MJ_HT_V1, MCCGQ02HL, and SJWS01LM devices running their stock firmware. As mentioned, it's highly intermittent: I've been running otherwise OK for months, but it has happened twice within the last few days, with no recent changes to the setup or radio environment.

It should be noted that the BLE driver can already restart Tasmota: if restarting NimBLE actively fails or times out, Tasmota will be restarted after 10 seconds. This change is effectively building upon that model to cope with a false positive.

I will investigate rules.

@s-hadinger
Collaborator

Let me rephrase my question. What happens if a user enables the driver but has zero BLE devices in range? Could Tasmota also reboot every 240 seconds?
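The scenario in this question can be made concrete with a small simulation. It assumes, without confirmation from the PR diff, that the watchdog looks only at the time since the last advert (or since boot), and that a reboot takes no time. `countRestartsWithNoAdverts` and the `requireFirstAdvert` guard are hypothetical; the guard is one possible mitigation that nobody in this thread proposed.

```cpp
#include <cstdint>

// Hypothetical simulation, not Tasmota code.
// Threshold from the PR description: restart the device after 240s of silence.
constexpr uint32_t kDeviceRestartAfterS = 240;

// Counts device restarts over `durationS` seconds during which no advert is
// ever received (e.g. no BLE devices in range).
// If `requireFirstAdvert` is true, the watchdog only arms after at least one
// advert has been seen since boot (an assumed guard, see lead-in).
uint32_t countRestartsWithNoAdverts(uint32_t durationS, bool requireFirstAdvert) {
    uint32_t restarts = 0;
    uint32_t bootS = 0;                      // time of the most recent boot
    const bool seenAdvertSinceBoot = false;  // nothing in range, ever

    for (uint32_t nowS = 1; nowS <= durationS; ++nowS) {
        const bool armed = !requireFirstAdvert || seenAdvertSinceBoot;
        if (armed && (nowS - bootS) > kDeviceRestartAfterS) {
            ++restarts;
            bootS = nowS;  // device reboots, silence timer starts over
        }
    }
    return restarts;
}
```

Under these assumptions, `countRestartsWithNoAdverts(3600, false)` returns 14, i.e. a reboot roughly every four minutes, while the guarded variant returns 0. The trade-off of such a guard is that a node whose BLE controller is already broken at boot would never trigger the recovery.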

@Jason2866
Collaborator

> It should be noted that the BLE driver can already restart Tasmota: if restarting NimBLE actively fails or times out, Tasmota will be restarted after 10 seconds. This change is effectively building upon that model to cope with a false positive.

This part of the code should not be there. It seems it was overlooked in the review.

@Jason2866
Collaborator

Jason2866 commented Jan 1, 2024

IIRC this driver is not ready for Arduino core 3.0.0, since it uses the h2zero NimBLE-Arduino library. For Arduino core 3.0.0 to support BLE on the C2 and C6 as well, it is necessary to switch to h2zero esp-nimble-cpp. Some changes are needed to use that library.
Tasmota will move to Arduino core 3.0.0 for all ESP32x builds when it is released.
Support for core 2.0.14 will be dropped and the NimBLE-Arduino library removed.

To do a core 3.0.0 BLE build, add this to platformio_tasmota_core3_env.ini:

[env:tasmota32-ble-arduino30]
extends                 = env:arduino30
board                   = esp32
build_unflags           = ${env:arduino30.build_unflags}
build_flags             = ${env:arduino30.build_flags}
                          -DFIRMWARE_BLUETOOTH
                          -DOTA_URL='""'
lib_ignore              = ${env:arduino30.lib_ignore}

The driver https://github.com/arendst/Tasmota/blob/development/tasmota/tasmota_xsns_sensor/xsns_62_esp32_mi.ino is already using esp-nimble-cpp.

@lis0r
Contributor Author

lis0r commented Jan 1, 2024

A Berry bodge, for others looking for a similar workaround:

last_ads = 0
def ble_trigger(value, trigger, msg)
  if value['adverts'] == last_ads
    tasmota.cmd("restart 1")
  end
  last_ads = value['adverts']
end
tasmota.add_rule("BLE", ble_trigger)

@lis0r lis0r closed this Jan 1, 2024
4 participants