Bluetooth: bt_conn: Unable to allocate buffer within timeout #43718
Comments
`att_send_process` doesn't pass the errors up the stack, so if a disconnection happens during the allocation of a TX context, it will leak buffers. Fixes zephyrproject-rtos#43718 Signed-off-by: Jonathan Rico <[email protected]>
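The leak described in this commit message can be illustrated with a minimal, self-contained sketch. This is not the Zephyr code: the pool, `lower_send`, and both `send_*` functions are hypothetical stand-ins, assuming that a failed send leaves ownership of the buffer with the caller. It only shows why dropping an error return can exhaust a fixed-size pool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define POOL_SIZE 2 /* hypothetical pool size, chosen small for illustration */

/* Hypothetical stand-in for a fixed-size TX buffer pool. */
static bool pool_used[POOL_SIZE];

/* Returns a buffer index, or -ENOMEM when the pool is exhausted. */
static int pool_alloc(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool_used[i]) {
            pool_used[i] = true;
            return i;
        }
    }
    return -ENOMEM;
}

static void pool_free(int idx)
{
    assert(idx >= 0 && idx < POOL_SIZE);
    pool_used[idx] = false;
}

static int pool_in_use(void)
{
    int n = 0;

    for (int i = 0; i < POOL_SIZE; i++) {
        n += pool_used[i] ? 1 : 0;
    }
    return n;
}

/* Hypothetical lower layer. On success it consumes the buffer; on failure
 * (link dropped) the caller still owns it. This ownership rule is an
 * assumption made for the sketch.
 */
static int lower_send(int buf, bool connected)
{
    if (!connected) {
        return -ENOTCONN;
    }
    pool_free(buf);
    return 0;
}

/* Leaky variant: the error is swallowed, so the buffer is never released. */
static void send_ignoring_errors(bool connected)
{
    int buf = pool_alloc();

    if (buf < 0) {
        return;
    }
    (void)lower_send(buf, connected);
}

/* Fixed variant: the error is passed up and the buffer is released. */
static int send_checking_errors(bool connected)
{
    int buf = pool_alloc();

    if (buf < 0) {
        return buf;
    }

    int err = lower_send(buf, connected);

    if (err) {
        pool_free(buf);
    }
    return err;
}
```

With the leaky variant, each send attempted while disconnected permanently consumes one pool entry, which would be consistent with allocation timeouts that persist until reboot.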
@jori-nordic I will assume you will resume discussion based on the work you are doing here jori-nordic@2215bf0
@jori-nordic Thank you for tracking this down. Is the fix in your branch likely to be the final solution or will there be other changes before it goes into the main branch?
I was on vacation, sorry for the delay. @aolowin the fix I have might have to change a bit: the bug still exists in latest upstream, but it has different symptoms due to the l2cap work that has been done. @hermabe has a PR (https://github.com/zephyrproject-rtos/zephyr/pull/45682/files) to fix some issues that were introduced there, and I'll be trying to reproduce this particular bug today with his fixes applied locally. EDIT: I haven't been able to reproduce the issue with @hermabe's PR applied locally. I think we can close this when that PR is merged. Could you try to reproduce it after that PR is merged, @aolowin?
@jori-nordic I'll try and reproduce it after that PR is in. Thanks.
@aolowin would you mind trying to reproduce the issue with the PR applied before it's merged? This way we would know that this is indeed a fix for this issue. You can check out this branch: https://github.com/hermabe/zephyr/tree/fix/meta_free from the PR.
I've tested #45682 and it does seem to fix the issue. By keeping the peripheral at the edge of the RF range I was able to force numerous disconnect/reconnect cycles and the peripheral was always able to recover. There were a few warnings:
but no errors.
Thanks for testing. Could you make sure that these warnings do not prevent the stack from sending data once buffers are available again? That is, can you confirm this is a recoverable warning that doesn't require rebooting any of the Zephyr-based devices?
I think these warnings are fine. The buffer itself is freed after it is sent to the controller, but the metadata is not freed until the callbacks are called after receiving the num_complete event from the controller. The allocation of the buffer blocks, but the allocation of the metadata does not, so if no metadata could be allocated the warning is printed and
I can confirm that the central_hr was able to receive notifications from the peripheral_hr once connected - regardless of whether the warnings occurred. No reboots required.
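The metadata behaviour discussed in the two comments above can be sketched in plain C. This is a simplified model, not the Zephyr implementation: `META_COUNT`, `meta_try_alloc`, `on_num_complete`, and `try_send` are hypothetical names, and what the real stack does after printing the warning is not shown in this thread, so the sketch simply reports failure to the caller. It illustrates only why a non-blocking metadata allocation produces a recoverable warning: slots come back when the controller reports completed packets.

```c
#include <stdbool.h>
#include <stdio.h>

#define META_COUNT 2 /* hypothetical number of TX metadata slots */

static int meta_free_count = META_COUNT;
static int warnings;

/* Non-blocking allocation: returns false immediately, with a warning,
 * when no metadata slot is free. It never waits.
 */
static bool meta_try_alloc(void)
{
    if (meta_free_count == 0) {
        warnings++;
        fprintf(stderr, "warning: no TX metadata available\n");
        return false;
    }
    meta_free_count--;
    return true;
}

/* Models the controller reporting completed packets: the metadata held
 * for those packets is released only at this point.
 */
static void on_num_complete(int completed)
{
    meta_free_count += completed;
    if (meta_free_count > META_COUNT) {
        meta_free_count = META_COUNT;
    }
}

/* Assumption for the sketch: a send that cannot get metadata fails and
 * may be attempted again later.
 */
static bool try_send(void)
{
    return meta_try_alloc();
}
```

In this model a burst of sends can trigger the warning, yet sending works again as soon as `on_num_complete` returns a slot, which matches the observation that no reboot was needed.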
@jori-nordic @hermabe any chance of the fixes for this issue being backported to 2.7?
Describe the bug
A BLE peripheral can sometimes get into a state where it will no longer send notifications. It seems to occur if connections are repeatedly dropped due to range or antenna issues. It may happen when the connection is lost during discovery, but that's a guess.
The following errors are generated:
[00:02:21.702,667] bt_conn: Disconnected while allocating context
[00:02:29.443,542] bt_conn: Unable to allocate buffer within timeout
[00:02:29.443,572] bt_l2cap: Unable to allocate buffer for op 0x12
[00:02:52.495,300] bt_conn: Unable to allocate buffer within timeout
[00:02:52.495,300] bt_att: Unable to allocate buffer for op 0x07
Once the peripheral gets into this state, only rebooting it will fix it. A reset on the central side has no effect.
To Reproduce
Use the central_hr and peripheral_hr sample apps on the nrf52840dk_nrf52840 boards:
west build samples/bluetooth/peripheral_hr --build-dir=./build/peripheral_hr -b nrf52840dk_nrf52840
west build samples/bluetooth/central_hr --build-dir=./build/central_hr -b nrf52840dk_nrf52840
For ease of desktop testing it's convenient to use:
CONFIG_BT_CTLR_TX_PWR_MINUS_40=y
on the central_hr device. This allows a disconnect by moving the boards a short distance apart.
Move the boards closer and farther from each other to trigger disconnect/reconnect events and eventually generate the erroneous state.
Expected behavior
A peripheral can cleanly reconnect after a disconnect.
Impact
Serious impact since the peripheral will be unable to communicate without a reboot.
Logs and console output
central_hr:
peripheral_hr:
Environment (please complete the following information):