BLE Connection fails to establish between two nRF52840-USB Dongles with Zephyr controller #29008
Comments
@RoyAnupam please provide the Zephyr commit hash from which you are building the hci_usb sample. |
Hello @cvinayak , BR, |
@RoyAnupam I have been busy. But I have tested one hci_usb nRF52840 dongle connected to my VirtualBox with Linux, and set up GATT services as stated by you. I do not see any connection supervision timeouts when I connect to it from a phone, nor when I connect back from Linux to the phone.
Do you mean that with hci_usb as a peripheral and a phone as central, you do not have connection loss? Could you try with different nRF52840 dongles (in case you may have damaged the crystals on your dongles)? You can contact me on Slack and we can have a screen sharing session, if you want me to debug live. |
@cvinayak , I appreciate your time to test & respond about this issue.
Yes, I don't face any connection loss between hci_usb as peripheral and an Android phone as client, or vice versa. Basically, when only one hci_usb dongle is involved and the other device is a commercial product like a phone, there is no connection drop. But whenever I use hci_usb on both sides, the connection drops.
Yes, I have tried with two different dongles; the behavior is the same. Could you check the scenario with two hci_usb dongles when you get a chance?
I have never used Slack, but I will check whether my company policy allows using it for screen sharing. Thank you |
@RoyAnupam Just an FYI, I edited your comment, you need a blank line after quoting someone.
|
Hello @cvinayak ,
Oh, I will check your btmon logs in detail and compare them with mine. Thank you very much for sharing the logs. |
You meant My BlueZ version is 5.55. BlueZ should not cause a connection timeout. If you can provide detailed step-by-step command lines or a script, it will be easier to reproduce your issue. |
Hello @cvinayak
Okay, sure. Let me work on this and share with you details. |
Hello @cvinayak From your description, if I understood correctly, you are using a single PC with VirtualBox, with two nRF dongles running hci_usb attached to the same PC? In the meantime, just for your info, my setup is as follows.
Setup2:
|
Yes. and I used |
Hello @cvinayak ,
Okay, got it. I will also try this scenario at my end. I reproduced the issue with the controllers on separate Ubuntu hosts (each running upstream BlueZ v5.55). Attachments contain the following
I hope this information will be useful for reproducing the scenario. By the way, I would also like to share the hci_usb prj.conf file changes. Please let me know if any further information would be helpful. Thank you |
@RoyAnupam Did you forget to upload files? (you can drag/drop or paste zip files here in the comments) |
Hello @cvinayak , |
@RoyAnupam you can also upload the files to dropbox or any other service and give us a link to them here. |
@carlescufi , |
Hello @cvinayak @carlescufi Attachments contain following
Please let me know if you face any issue in accessing these files. Thanks Linux_Client_bluetoothd.txt |
From Linux_Client_btmon.txt:
From Linux_Peripheral_btmon.txt:
The handles in the last ATT Read Request (0x0012 sent, 0x0003 received) do not match. Are you sure this log is from a connection between the two dongles? |
Hello @cvinayak ,
Yes, I cross-checked: the logs are from two dongles, each attached to a BlueZ 5.55 host on a separate Ubuntu machine. I have captured the logs again. This time, along with the btmon logs, I am also sharing cfa files, which can be viewed directly in Frontline's Comprobe Protocol Analyzer system (CPAS) file viewer software. Below are the random addresses of the dongles:
Client connection initiation (can be found in Client_btmon.txt) Peripheral connection Complete (can be found in Peripheral_btmon.txt)
Please find attached the btmon and cfa logs |
@RoyAnupam could you please try to capture a sniffer trace using the Nordic sniffer, if you have a spare Nordic Development Kit? You can find the instructions here: |
@cvinayak FYI this is Connection Timeout (0x08) on both sides, so likely a controller issue. |
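For readers following the logs, here is a minimal Python sketch of how the reason byte in an HCI Disconnection Complete event maps to "Connection Timeout (0x08)". The sample bytes are hypothetical and were not taken from the attached btmon logs; the helper name `parse_disconnection_complete` is an illustration only, and the reason table is a small subset of the error codes in the Bluetooth Core Specification.

```python
import struct

# A few HCI error codes from the Bluetooth Core Specification (not exhaustive).
HCI_REASONS = {
    0x08: "Connection Timeout",
    0x13: "Remote User Terminated Connection",
    0x16: "Connection Terminated By Local Host",
    0x3E: "Connection Failed to be Established",
}


def parse_disconnection_complete(event: bytes) -> dict:
    """Decode an HCI Disconnection Complete event (event code 0x05, 4 parameter bytes)."""
    if len(event) != 6 or event[0] != 0x05 or event[1] != 4:
        raise ValueError("not an HCI Disconnection Complete event")
    # Parameters are little-endian: status (1), connection handle (2), reason (1).
    status, handle, reason = struct.unpack_from("<BHB", event, 2)
    return {
        "status": status,
        "handle": handle & 0x0FFF,  # only the low 12 bits carry the handle
        "reason_code": reason,
        "reason": HCI_REASONS.get(reason, f"Unknown (0x{reason:02X})"),
    }


# Hypothetical event bytes, not copied from the attached logs.
sample = bytes.fromhex("050400000008")
print(parse_disconnection_complete(sample))
```

When both hosts report reason 0x08, neither side asked for the disconnect, which is why the comment above points at the link itself rather than at either host.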
Hello @carlescufi ,
Unfortunately, I don't have a Nordic Sniffer. I am sorry about that. However, I did try a Frontline Comprobe Air Sniffer. In those logs, as far as I remember, there was not much communication after the CONN_IND PDU was sent by the initiator. |
Please try the following in prj.conf of the hci_usb sample. |
@RoyAnupam in Wireshark, select View -> Interface Toolbars -> "nRF Sniffer for Bluetooth LE". You now have a toolbar with "Device" set to "All advertising devices"; open this drop-down and select the advertising device whose connection you wish to capture. |
Hello @cvinayak , |
@RoyAnupam please attach the sniffer log file covering the connection from establishment to disconnection. From a screenshot of a few seconds of the connection, there is not much I can analyze. Did you get the disconnection with supervision timeout (in the sniffer, only packets from the master are seen for the duration of the supervision timeout, after which packets stop)? |
Hello @cvinayak While I was testing yesterday, I got a disconnection once after about 3-4 minutes of connection, but at that time the Wireshark filter settings were not correct, so unfortunately that log is not useful. I will test again and share my observations and a correct sniffer log. |
Hello @cvinayak ,
Yes, the last packet was from the master at time 14399.221; after that, there was nothing from the slave, due to which the connection terminated. I can see the peripheral device switched to advertising state after the connection was terminated, at time 14401.047. Please refer to the screenshot for the same. |
It just looks like the Peripheral did a proper locally initiated disconnect, which the Central acknowledged in packet no. 476411, except that the Info says L2CAP Fragment (is the connection encrypted?). Without the detailed sniffer logs to look at, I can't be certain. Can you email me the sniffer log zip file? My email address is in any of the controller commit messages in Zephyr. |
Hello @cvinayak , |
@RoyAnupam Thank you for the sniffer log, and for following up on the requests in this issue. From the time reference set between CRC errors in the master's packets, it appears that the 1.28 s advertising interval, when colliding with the connection event, may have corrupted the master's event counter. Please take in the fixes here: #30119 CC @mtpr-ot |
Hello @cvinayak ,
git project: git@github.com:zephyrproject-rtos/zephyr.git
$ git branch
git tip commit hash: 4f4dd9f
Following commits are part of latest master
git diff output
CONFIG_BT=y
CONFIG_USB=y
Please find below a screenshot of the AirSniffer log. Also, please find the Wireshark capture file attached. BR, |
Could you scan for the number of Wi-Fi access points in the vicinity of your setup? From the sniffer log review, there is a wide spread of CRC errors and packet drops, with Wi-Fi channels across almost the whole 1-13 range overlapping the BLE radio channels.
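To make the overlap concrete, here is a minimal Python sketch that lists which BLE data channels fall inside a given set of 2.4 GHz Wi-Fi channels. It relies on the standard channel plans (BLE data channels 0-36 at 2 MHz spacing, Wi-Fi channel n centered at 2407 + 5n MHz). The 11 MHz half-width (a 22 MHz wide Wi-Fi channel) and the function names are assumptions made for illustration, not values taken from the sniffer log.

```python
def ble_data_channel_mhz(index: int) -> int:
    """Center frequency of a BLE data channel (index 0-36) in MHz."""
    if not 0 <= index <= 36:
        raise ValueError("BLE data channel index must be 0-36")
    # 2426 MHz is advertising channel 38, so data channels 11-36 sit 2 MHz higher.
    return 2404 + 2 * index if index <= 10 else 2406 + 2 * index


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Wi-Fi channel (1-13) in MHz."""
    if not 1 <= channel <= 13:
        raise ValueError("Wi-Fi channel must be 1-13")
    return 2407 + 5 * channel


def overlapped_ble_channels(wifi_channels, half_width_mhz: int = 11) -> list:
    """BLE data channels whose center lies within half_width_mhz of any Wi-Fi center."""
    hit = set()
    for wifi in wifi_channels:
        center = wifi_center_mhz(wifi)
        for index in range(37):
            if abs(ble_data_channel_mhz(index) - center) <= half_width_mhz:
                hit.add(index)
    return sorted(hit)


# Hypothetical Wi-Fi layouts: the common 1/6/11 plan, then every channel 1-13.
busy = overlapped_ble_channels([1, 6, 11])
print("free with Wi-Fi 1/6/11:", sorted(set(range(37)) - set(busy)))
print("overlapped with Wi-Fi 1-13:", len(overlapped_ble_channels(range(1, 14))), "of 37")
```

Under these assumptions, access points spread over Wi-Fi channels 1-13 leave no BLE data channel untouched, which is consistent with CRC errors appearing across the whole band.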
Hello @cvinayak ,
I can see a list of ~25 APs on my Android S8+ phone. BR, |
@RoyAnupam From what I can see both devices in the Connection reach supervision timeout. This also happens in the sniffer (the sniffer reaches supervision timeout and starts scanning advertiser channels, confirmed by actually seeing them). Can you try with a longer supervision timeout? See if the connection can resynchronize then? |
Hello @cvinayak & @joerchan , Result: The connection seems to be rock solid! It has been 23+ hours now, and the connection is still up as I write this. I am testing in the same environment, with many APs, as mentioned earlier in the thread. Could you suggest why the connection is not stable between two nRF dongles, while it is rock solid between an nRF dongle and an Android phone, in the same test environment? Note: I have attached the sniffer log for the above connection. Please note that I could share only 12 hours of the log, as I had to stop the capture after ~12 hours because the file size was getting huge. I would be happy to share more info, if needed. Thank you. BR, |
Hello @cvinayak, |
@RoyAnupam No, it is not an issue between the dongles; it is the responsibility of the host to perform periodic channel map updates or to increase the connection supervision timeout. As you can see, the phone's host performs periodic channel map updates and uses a supervision timeout of 5 seconds, whereas your setup does not perform any channel map update and uses a supervision timeout of 420 ms. |
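A rough way to see why 420 ms is fragile in a noisy band is to count how many connection events fit into the supervision timeout. The Python sketch below uses the HCI units (supervision timeout in 10 ms steps, connection interval in 1.25 ms steps). The 30 ms connection interval (0x0018) is an assumption chosen for illustration; the interval actually used in the failing setup is not stated in this thread.

```python
def supervision_timeout_ms(units: int) -> int:
    """Convert an HCI supervision timeout value (10 ms units) to milliseconds."""
    if not 0x000A <= units <= 0x0C80:
        raise ValueError("supervision timeout must be 0x000A-0x0C80")
    return units * 10


def events_within_timeout(timeout_units: int, interval_units: int) -> int:
    """Whole connection events that fit into one supervision timeout."""
    supervision_timeout_ms(timeout_units)  # range check only
    if not 0x0006 <= interval_units <= 0x0C80:
        raise ValueError("connection interval must be 0x0006-0x0C80")
    # (timeout_units * 10 ms) / (interval_units * 1.25 ms) == 8 * timeout / interval
    return (timeout_units * 8) // interval_units


ASSUMED_INTERVAL = 0x0018  # 30 ms, an assumption, not a value from the logs

for label, timeout in (("420 ms", 0x002A), ("5 s", 0x01F4)):
    print(label, "->", events_within_timeout(timeout, ASSUMED_INTERVAL), "events")
```

With the assumed interval, the shorter timeout tolerates roughly a dozen consecutive lost events, while the 5 second timeout used by the phone tolerates more than a hundred, giving the link far more chances to land on a clean channel.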
Hello @cvinayak , BR, |
@RoyAnupam you can hard code your desired channel map in this file: https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/bluetooth/controller/ll_sw/ull_chan.c |
@RoyAnupam you can also use the existing bt_le_set_chan_map API call instead, to avoid having to modify the code at all. |
Hello @cvinayak ,
Thank you for your suggestion. I will look into the code and try it.
Hello @carlescufi ,
Is this API part of the BT HCI USB controller code? Actually, I am using the Zephyr HCI USB controller for my nRF52840 USB dongle with a BlueZ host (Ubuntu PC). Please help me understand if I am missing something. Thank you |
No, but it's an HCI command, so you can use BlueZ to send that command. The command is "LE Set Host Channel Classification". |
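As a sketch of what sending that command from a BlueZ host could look like, the Python below packs a 37-bit channel map into the 5-byte parameter of LE Set Host Channel Classification (OGF 0x08, OCF 0x0014) and formats it as an `hcitool cmd` line. Bit n of the map corresponds to data channel n, least significant byte first, with 1 meaning "unknown" (usable) and 0 meaning "bad". The list of bad channels is hypothetical, the helper names are illustrative, and the sketch only rejects an empty map; the Core Specification should be checked for the exact minimum number of channels that must stay enabled.

```python
def channel_map_bytes(bad_channels) -> bytes:
    """Pack a BLE data channel map: bit n = channel n, 1 = usable, 0 = bad."""
    bits = (1 << 37) - 1  # start with all 37 data channels marked usable
    for channel in bad_channels:
        if not 0 <= channel <= 36:
            raise ValueError("BLE data channel index must be 0-36")
        bits &= ~(1 << channel)
    if bits == 0:
        raise ValueError("at least one channel must stay enabled")
    # The three most significant bits of the last byte are reserved and stay 0.
    return bits.to_bytes(5, "little")


def hcitool_command(bad_channels) -> str:
    """Format the channel map as an hcitool raw-command line."""
    payload = " ".join(f"0x{byte:02X}" for byte in channel_map_bytes(bad_channels))
    return f"hcitool cmd 0x08 0x0014 {payload}"


# Hypothetical example: mark data channels 0-9 as bad.
print(hcitool_command(range(10)))
```

The printed line is meant to be run on the host of the central device, since the channel map of a connection is controlled by the central.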
Hello @carlescufi , @cvinayak & @carlescufi ,
$ hcitool cmd 0x08 0x0013 0x00 0x00 0x18 0x00 0x1C 0x00 0x00 0x00 0x58 0x02 0x09 0x00 0x09 0x00
< HCI Command: LE Connection Update (0x08|0x0013) plen 14
Observation: The connection between the two nRF dongles is stable and still active after ~5 hours, as I write this post.
One observation (off-topic): The nRF sniffer dongle seemed to just die when I updated the PHY to LE Coded PHY between the nRF dongles. The sniffer stopped capturing completely. It was working fine while the PHY was 1M or 2M. I was wondering, can it not listen on the Coded PHY?
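For reference, the 14 parameter bytes of the LE Connection Update command above can be decoded with a short Python sketch. The field layout (seven little-endian 16-bit values) and the units follow the HCI definition of the command; the function name and dictionary keys are illustrative choices.

```python
import struct


def decode_le_connection_update(params: bytes) -> dict:
    """Decode the 14 parameter bytes of HCI LE Connection Update (0x08|0x0013)."""
    if len(params) != 14:
        raise ValueError("LE Connection Update takes exactly 14 parameter bytes")
    (handle, interval_min, interval_max, latency,
     timeout, ce_min, ce_max) = struct.unpack("<7H", params)
    return {
        "handle": handle,
        "interval_min_ms": interval_min * 1.25,   # 1.25 ms units
        "interval_max_ms": interval_max * 1.25,
        "latency_events": latency,
        "supervision_timeout_ms": timeout * 10,   # 10 ms units
        "ce_length_min_ms": ce_min * 0.625,       # 0.625 ms units
        "ce_length_max_ms": ce_max * 0.625,
    }


# Parameter bytes copied from the hcitool command above.
raw = bytes([0x00, 0x00, 0x18, 0x00, 0x1C, 0x00, 0x00, 0x00,
             0x58, 0x02, 0x09, 0x00, 0x09, 0x00])
for name, value in decode_le_connection_update(raw).items():
    print(f"{name}: {value}")
```

Decoded this way, the command requests a 30-35 ms connection interval, no peripheral latency, and a supervision timeout of 6 seconds on handle 0, compared with the 420 ms timeout discussed earlier in the thread.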
As of latest release 3.1.0, the sniffer does not support LE Coded PHY. |
Ok, thanks for confirming. Does the nRF52840 dongle (sniffer_nrf52840dongle_nrf52840_7cc811f.hex) support Extended Advertising & Extended Connection over 2M PHY? |
@RoyAnupam Based on the conclusion, please close this issue. You may create a new question issue for further discussion. |
Hello @cvinayak,
Sure, will do. Thanks. |
No, this is planned for the next release (along with LE Coded PHY). |
Hello @joerchan ,
All right, thank you very much for this valuable information. |
Describe the bug
Observation: LE Connection fails to establish between two nRF52840-USB Dongles
Device Roles:
Reason: Host receives Disconnect Complete with Reason: Connection Timeout
To Reproduce
Steps to reproduce the behavior:
Device A: Create a GATT server using the bluetoothctl tool (add a GATT service 0x1834, characteristic 0x2834, descriptor 0x2902)
Device B: Scan for and connect to Device A from the bluetoothctl tool
Expected behavior
Connection should not drop.
Impact
Not able to proceed with further tests
Logs and console output
HCI Dump log in GATT Central Peripheral (Advertiser)
Environment (please complete the following information):
Additional context
Observation: Sometimes the connection drops within a few seconds and sometimes after a few minutes, without any host involvement or any command being sent from the host to the controller. Also, the BLE connection between one nRF52840-USB dongle and another device (Android P) is never terminated by Connection Timeout, with the nRF as GATT server and Android as GATT client.