Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coded phy works (better) with S140 6.1.x #2465

Open
fanoush opened this issue Feb 15, 2024 · 10 comments
Open

coded phy works (better) with S140 6.1.x #2465

fanoush opened this issue Feb 15, 2024 · 10 comments

Comments

@fanoush
Copy link
Contributor

fanoush commented Feb 15, 2024

I am using xaomi thermometer with custom firmware https://github.com/pvvx/ATC_MiThermometer that is able to advertise over coded phy. I can find it in nrfconnect on my phone but I could never find it with Espruino on 52840 (with S140 6.0.0).

Now I tried with 6.1.1 (and 6.1.0) and just by updating softdevice it works with same espruino binary - suddenly the thermometer can be found!

with 6.0.0

NRF.findDevices(function(devices) { console.log(devices);}, { timeout : 7000, active : true, phy:"coded"});
=undefined
[  ]

with 6.1.1

>NRF.findDevices(function(devices) { console.log(devices);}, {timeout : 7000, active : true, phy:"coded"});
=undefined
[
  BluetoothDevice: {
    "id": "a4:c1:38:c9:52:1e public",
    "rssi": -44,
    "data": new Uint8Array([2, 1, 6, 18, 22, 26, 24, 30, 82, 201, 56, 193, 164, 235, 8, 245, 22, 237, 10, 85, 231, 4, 11, 9, 76, 89, 87, 83, 68, 48, 51, 45, 49, 69]).buffer,
    "name": "LYWSD03-1E",
    "serviceData": {
      "181a": new Uint8Array([30, 82, 201, 56, 193, 164, 235, 8, 245, 22, 237, 10, 85, 231, 4]).buffer
     }
   }
 ]

I checked S140 6.1.0 release notes and it actually starts with

The main new feature for s140_nrf52_6.1.0 compared to s140_nrf52_6.0.0 is the full support for all mandatory LE Advertising Extensions features and qualified LE Coded PHY feature.

the "new features" section also mentions

  • Qualified LE Coded PHY feature (DRGN-5702).
  • Qualified LE Advertising Extensions feature (DRGN-7504)
  • The scanner and initiator roles can now be configured to receive ADV_EXT_IND PDUs on both 1M and Coded PHY, using
    a single call to sd_ble_gap_scan_start() or sd_ble_gap_connect() (DRGN-8668).
  • It is now possible to send and receive advertising packets with up to 255 bytes of payload (DRGN-9315).
  • Privacy for Advertising Extensions is fully supported (DRGN-9340).
  • The SoftDevice is now able to receive chained advertisements (DRGN-9734).
  • The SoftDevice is now able to send chained advertisements. The advertising data fragmentation is handled autonomously
    by the SoftDevice (DRGN-9802)

There is quite a lot of bugfixes and there is also migration document with code describing how to scan on both PHYs at once (basically double the interval) I am attaching both here
s140_nrf52_6.1.0_migration-document.pdf
s140_nrf52_6.1.0_release-notes.pdf

I also noticed one mentioned change which may be related to the 4096 block writing bug to internal flash, looks like maybe they reduced it too much :-)

The time reserved by the SoftDevice is reduced by 297 µs when performing a flash word write, and by 4.7 ms when
performing a flash page erase. This increases the probability of successfully scheduled flash operations (DRGN-9048).

So basically I am suggesting to eventually move to 6.1.0 or 6.1.1 to support coded phy/advertising extensions better. The only issue I see that need fixing with 6.1.x is reducing flash write block size from 4096 to 2048 as per that comment.

@fanoush
Copy link
Contributor Author

fanoush commented Feb 15, 2024

BTW, it is probably about extended advertising since even with passive scan the data array returned is 34 bytes so over normal advertising limit of 31 bytes

>NRF.findDevices(function(devices) { console.log(devices);}, {timeout : 2000, active : false, phy:"coded"});
=undefined
[
  BluetoothDevice: {
    "id": "a4:c1:38:c9:52:1e public",
    "rssi": -55,
    "data": new Uint8Array([2, 1, 6, 18, 22, 26, 24, 30, 82, 201, 56, 193, 164, 122, 9, 66, 20, 217, 10, 82, 136, 4, 11, 9, 76, 89, 87, 83, 68, 48, 51, 45, 49, 69]).buffer,
    "name": "LYWSD03-1E",
    "serviceData": {
      "181a": new Uint8Array([30, 82, 201, 56, 193, 164, 122, 9, 66, 20, 217, 10, 82, 136, 4]).buffer
     }
   }
 ]
>

so I guess that is why the https://github.com/espruino/BangleApps/tree/master/apps/shownearby works fine with 6.0.0. and the thermometer is not visible.

@gfwilliams
Copy link
Member

Ahh, that's interesting - thanks! I wonder if there isn't something else that could be done to help us receive bigger advertising on the 6.0.0 softdevice, if that's the underlying issue?

Sorry but just to check, you say you swapped the softdevice hex and made no other changes to the compile and it was all ok?

Definitely moving to the new SD would be good if it does improve the situation - but I'm a bit anxious about doing OTA updates for Bangle.js given how much grief I get from users every time there's even a minor app update :(

@fanoush
Copy link
Contributor Author

fanoush commented Feb 19, 2024

Sorry but just to check, you say you swapped the softdevice hex and made no other changes to the compile and it was all ok?

yes except that I am building with 2048 instead of 4096 as mentioned here #2000 (comment) so that internal flash writing is stable on both.

Then you can upgrade/downgrade softdevices and it works on both. So we actually don't need to force people to upgrade. But if we put 6.1.1 to the repo and build with that then 6.0.0 won't be tested much.

I wonder if there isn't something else that could be done to help us receive bigger advertising

Not sure if nrf52 to nrf52 (i.e. Bangle2 to Bangle2) could do better with 6.0.0 or it s not there at all, that is another thing that is mentioned in 6.1.0 release notes (It is now possible to send and receive advertising packets with up to 255 bytes of payload). But for now we actually can't advertise more so we cannot check, so far we only have static buffers for 31 bytes here anyway
https://github.com/espruino/Espruino/blob/master/targets/nrf5x/bluetooth.c#L2647
maybe we should change that to locked flat buffer string variables (?) as both can be up to 255 with extended advertising which is quite a lot for static data.

And btw did scanning with phy set to "both" or "2mbps" worked for you at some point? like
NRF.findDevices(function(devices) { console.log(devices);}, {timeout : 2000, active : false, phy:"both"});
I just always get some errors (0x7 INVALID_PARAM, 0x8 INVALID_STATE).

However with 6.1.x I tried to redefine "both" to use BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED like this

      } else if (jsvIsStringEqual(advPhy,"both")) {
        m_scan_param.scan_phys = BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED;
      if (m_scan_param.interval<m_scan_param.window*2) m_scan_param.interval=m_scan_param.window*2;
        m_scan_param.extended = 1;
      }

with interval being twice the window as suggested in migration guide pdf and this works very nice, I get one list with both normal and coded phy advertisements when scanning.

@gfwilliams
Copy link
Member

Thanks - I've just moved to 2048b writes - probably a good idea anyway. Does that affect DFU do you know? If that was writing 4096b then we might be in a situation where you wrote the new softdevice but the device was then unable to update itself? I guess mostly DFU works in 2k blocks so maybe there is no point where it writes more.

maybe we should change that to locked flat buffer string variables

That's a tricky one, because when you reset()/load() we rip everything down and put it back, so the flat buffer may get moved. Might have to think about that - for sending advertisements we kind of work around that already, but for receiving I guess we just have to ensure we disable reception..

did scanning with phy set to "both" or "2mbps" worked for you at some point?

I'm not honestly sure - I found it quite hard to test when I was doing it, so maybe it never worked.

@fanoush
Copy link
Contributor Author

fanoush commented Feb 20, 2024

I'm not honestly sure - I found it quite hard to test when I was doing it, so maybe it never worked.

So I was testing this and changed the code here https://github.com/espruino/Espruino/blob/master/targets/nrf5x/bluetooth.c#L2990 so all three bits could be set independently

#if NRF_SD_BLE_API_VERSION>5
      if (jsvObjectGetBoolChild(options, "extended"))
        m_scan_param.extended = 1;
      JsVar *advPhy = jsvObjectGetChildIfExists(options, "phy");
      if (jsvIsUndefined(advPhy)) {
        // default
      } else {
        char phys[18]; // up to "1mbps,2mbps,coded"
        size_t len=jsvGetString(advPhy,phys,18);
        phys[len]=0;
        int phycount=0;
        if (strstr(phys,"2mbps")!=NULL) {
          m_scan_param.scan_phys |= BLE_GAP_PHY_2MBPS;
          m_scan_param.extended = 1;
          phycount++;
        }
        if (strstr(phys,"1mbps")!=NULL) {
          m_scan_param.scan_phys |= BLE_GAP_PHY_1MBPS;
          phycount++;
        }
        if (strstr(phys,"coded")!=NULL) {
          m_scan_param.scan_phys |= BLE_GAP_PHY_CODED;
          m_scan_param.extended = 1;
          phycount++;
        }
        if (strstr(phys,"both")!=NULL) {
          m_scan_param.scan_phys = BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED;
          m_scan_param.extended = 1;
          phycount=2;
        }
        if (phycount==0)
          jsWarn("Unknown phy %q\n", advPhy);
        else 
          if (m_scan_param.interval<m_scan_param.window*phycount) m_scan_param.interval=m_scan_param.window*phycount;
      }
      jsvUnLock(advPhy);
#endif

and the result is that very few combinations work , only

  • BLE_GAP_PHY_1MBPS
  • BLE_GAP_PHY_CODED
  • BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED
    so that generic code above probably does not makes sense to add. What does not work for me when scanning (I get Uncaught Error: ERR 0x7 (INVALID_PARAM) (:2085)) is
  • BLE_GAP_PHY_2MBPS alone or with BLE_GAP_PHY_1MBPS or with BLE_GAP_PHY_CODED
  • all three

According to this https://infocenter.nordicsemi.com/topic/com.nordic.infocenter.s140.api.v6.1.1/structble__gap__scan__params__t.html?cp=5_7_4_4_2_1_4_10_6#a369859ca03e34c8220452ae95d1aa02f (while the explanation is a bit confusing) I understand it like this

  • BLE_GAP_PHY_1MBPS = only 1mbps is scanned, if extended =1 then maybe also 2mbps is scanned as secondary(?)
  • BLE_GAP_PHY_CODED = only coded is scanned
  • BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED = 1mbit and coded is scanned, 2mbit probably not(?)

Maybe some of those combinations may make sense only when connecting, not scanning (as per documentation)

I am still not sure about extending the window when scanning more phys. Why for coded the interval should be window*2 as per migration guide and why not for 1+2mbps. I don't know why it would not simply switch PHY in next rounds since e.g. window 100,interval 100, timeout 2000 should cycle through three advertising channels every 100 units so why not change phy then and do it again (in same way for 1mbps+coded and 1+2mbps.

Anyway, thank for changing the block size. As for the scanning stuff mentioned above I think it would make sense to redefine phy: "both" to be BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED and not sure what to do with "2mbps". If it does not work for you too the maybe changing it to BLE_GAP_PHY_1MBPS and setting extended=1 would scan for it too. Or just remove it as the extended flag has its own setting.

Or if "both" is not descriptive enough the code above can work too with 2mbps option removed, but that would break current documentation.

I also looked into setAdvertising with coded phy (including enabling connectable:true) and 2mbps but that is for another comment a bit later (or another issue) :-)

Does that affect DFU do you know? I

I never had any issue with DFU update on 6.1.x so maybe as you say it does not use such big block.

@fanoush
Copy link
Contributor Author

fanoush commented Feb 20, 2024

And BTW the double interval size when scanning for both phys on 6.1.x is needed, I have added overriding window/interval like

      jsvUnLock(advPhy);
#endif
      uint32_t scan_window=jsvObjectGetIntegerChild(options, "window");
      if (scan_window>=4 && scan_window<=16384) m_scan_param.window=scan_window;
      uint32_t scan_interval=jsvObjectGetIntegerChild(options, "interval");
      if (scan_interval>=4&& scan_interval<=16384) m_scan_param.interval=scan_interval;
      if (m_scan_param.interval<m_scan_param.window) m_scan_param.interval=m_scan_param.window;
    }

And when setting scanning to both 1mbps and coded and set interval to less than 2* window (like window:100, interval:199) I get error unless interval is 200 and up. this does not help when trying to set both 1 and 2mbps - i get error all the time for that

@gfwilliams
Copy link
Member

I think it would make sense to redefine phy: "both" to be BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED

You did mention under "...very few combinations work , only..." that all 3 work, and I believe that's what BLE_GAP_PHYS_SUPPORTED does which is what we use for both? BLE_GAP_PHY_1MBPS | BLE_GAP_PHY_2MBPS | BLE_GAP_PHY_CODED

But maybe that should be all? both is a bit confusing if it's actually BLE_GAP_PHY_1MBPS|BLE_GAP_PHY_CODED but we could maybe add 1mbpscoded or someting?

Good idea about specify window and interval, I'll add that - I think it needs a MSEC_TO_UNITS around it?

@fanoush
Copy link
Contributor Author

fanoush commented Feb 22, 2024

Good idea about specify window and interval, I'll add that - I think it needs a MSEC_TO_UNITS around it?

yes, was just quick hack to have something there as any number is good no matter the unit.

also was thinking why we have everything BLE related in milliseconds when the (((TIME) * 1000) / (625)) conversion makes it inexact and there is extra multiplication and division bloating the code but yes people are probably used to miliseconds :-)

I believe that's what BLE_GAP_PHYS_SUPPORTED does which is what we use for both?

Yes but that actually always gives me error. As it is now I get error for "both" and also for "2mbps" . So only "coded" or "1mbps" worked for me. that's why I changed "both" to 1mbps+coded Can you try too with unmodified firmware?

@gfwilliams
Copy link
Member

people are probably used to miliseconds

Yes, and it makes more sense if we support other platforms - the inaccuracy is a bit annoying though I know.

Just tested and it doesn't work for me either, so just pushing a change now

@fanoush
Copy link
Contributor Author

fanoush commented Feb 26, 2024

Just to follow up with coded phy vs connectable (and scannable).

By default connectable+coded phy does not work with current build. Found out it is because of NRF_SDH_BLE_GAP_EVENT_LENGTH being 3 in targets/nrf5x/app_config.h. However increasing it to minimum of BLE_GAP_EVENT_LENGTH_CODED_PHY_MIN = 6 needs increasing memory for softdevice. For bangle with 2 centrals and 131 MTU it needs 0x3b70 instead of 0x3660. Then it works - sort of.

Another issue is about connectable+scannable - that I found out is invalid combination for coded phy (or extended advertising) in BLE. So now I remember and know why there is no BLE_GAP_ADV_TYPE_EXTENDED_CONNECTABLE_SCANNABLE_UNDIRECTED constant used here https://github.com/espruino/Espruino/blob/master/targets/nrf5x/bluetooth.c#L2782 so the line has same NONSCANNABLE constant there twice because the other one does not exist. Maybe it is because with extended advertising you can make advertising packet larger so it is not needed and maybe for long range it would also take too long. Anyway more correct would probably be to show error there instead of switching to nonscannable silently. Probably my fault when I was refactoring that part.

And BTW, there is nice info about it here https://devzone.nordicsemi.com/f/nordic-q-a/47073/general-questions-about-notifications-low-level-ble-packets-and-softdevice-phy-connection-interval-connection-event-length-att-mtu-and-dle
Which also says that while larger MTU work the packet limit for coded phy on physical layer is 27 bytes so larger packets get split.

What I get from all of this is that if we want to switch device to be connectable over coded phy (worthwhile even for Bangle 2 IMO, I hate when I leave my phone on a desk, go to kitchen and it gets disconnected from gadgetbridge) the best is probably also lowering MTU and restarting SoftDevice to have smaller memory requirements (i,e same as now) because maybe larger MTU over coded phy is not that great idea anyway. So something like we already have for other cases when softdevice restart is needed. And when switching back to normal mode we may have to restart again and increase MTU (and possibly also decrease NRF_SDH_BLE_GAP_EVENT_LENGTH - which should be also tunable at runtime)

And as the scannable mode is not supported with coded phy+connectable we would probably also need to support extended advertising with larger size to fit everything in without scan response packet. That can be 'hacked' to keep same size by actually joining the m_adv_data and m_scan_rsp_data to be one single bigger buffer instead of

static uint8_t m_adv_data[BLE_GAP_ADV_SET_DATA_SIZE_MAX];
static uint8_t m_scan_rsp_data[BLE_GAP_ADV_SET_DATA_SIZE_MAX]; // 31

and reuse that area more dynamically so even with 1mbps with extended advertising one could make device non scannable and have larger advertising packet instead of scan response.

So far I only tested connectable+coded phy with larger memory requirements, all other stuff is just ideas for now. I know I probably won't have time for this for next 14 days so just updating this with the info I know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants