10% failure rate of individual DM subsystem ranging calculations

Summary
I'm finding that individual ranging calculations in the DM (distance measurement) subsystem fail around 10% of the time, with the two devices in line of sight and about one meter apart. Is this expected?
 

Test Setup
I have two nRF5340s running a (slightly) modified version of the "nrf_dm" sample with all of the default Kconfig options and Bluetooth scan/advertising settings, except for the following:

CONFIG_DM_TIMESLOT_QUEUE_COUNT_SAME_PEER=1
CONFIG_LOG_MODE_DEFERRED=y
 

  • I'm doing one-directional ranging and using MAC address filters to know when to start ranging.
  • Environment is a typical office space with 10-20 nearby laptops and a couple of Wi-Fi routers
  • The nRF5340 on the left is the "reflector" and is only advertising.
  • The nRF5340 on the right is the "initiator" and is only scanning.
  • Most importantly: The application logic ensures only one ranging calculation happens every 3 seconds:
    • the "reflector" turns on advertising (timeout = 1000ms) and starts a 3 second timer
    • the "reflector" stops advertising after the first call to dm_request_add()
    • when the timer expires, the "reflector" starts advertising again and the process repeats
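
The cycle described above can be modeled offline to check what "one ranging calculation every 3 seconds" implies for the counters. This is only a sketch under simplifying assumptions that are not from the post: the 3 second timer and advertising start at the same instant (t = 0), there is no drift, and a scan event inside the 1000 ms advertising window always leads to exactly one dm_request_add() call. The function name served_cycles is hypothetical.

```python
CYCLE_MS = 3000        # period of the reflector's timer (from the post)
ADV_TIMEOUT_MS = 1000  # advertising timeout (from the post)


def served_cycles(scan_times_ms):
    """Return the set of cycle indices in which a ranging request was queued.

    scan_times_ms: times (ms since start) at which the initiator scanned
    the reflector. Assumption: each cycle starts at a multiple of CYCLE_MS.
    """
    served = set()
    for t in sorted(scan_times_ms):
        cycle = t // CYCLE_MS
        # The reflector only advertises during the first ADV_TIMEOUT_MS of a cycle.
        in_adv_window = (t % CYCLE_MS) < ADV_TIMEOUT_MS
        # Advertising stops after the first request, so later scans in the
        # same cycle cannot queue a second request.
        if in_adv_window and cycle not in served:
            served.add(cycle)
    return served


# Two scans in cycle 0 (only the first counts), one in cycle 1,
# and one in cycle 2 that arrives after advertising has timed out.
print(served_cycles([100, 200, 3100, 7500]))  # {0, 1}
```

In this model, any cycle missing from the returned set is what the statistics below count as a scanning failure rather than a ranging failure.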
       
Test Results
I get slightly different results depending on which device's log output I analyze. In the data below, I tracked each failure type separately so that ranging failures can be isolated from synchronization failures (e.g. the "reflector" advertisement was never scanned by the "initiator").
 
From the "initiator" perspective:
Total Failures: 111
  Ranging Failures: 58    i.e. data_ready() never called
  CRC Failures: 15        i.e. Quality == "crc fail"
  Scanning Failures: 38   i.e. scan_filter_match()/data_cb() never called
Ranging Requests: 699     i.e. dm_request_add() was called
Ranging Successes: 626    i.e. Quality == "ok"
 
Looking at just (Ranging Failures) / (Ranging Requests): 8.3% failure rate
Looking at (Ranging Failures + CRC Failures) / (Ranging Requests): 10.4% failure rate
 
From the "reflector" perspective:
Total Failures: 110
  Ranging Failures: 74    i.e. data_ready() never called
  CRC Failures: 11        i.e. Quality == "crc fail"
  Scanning Failures: 25   i.e. adv_scanned_cb() never called
Ranging Requests: 712     i.e. dm_request_add() was called
Ranging Successes: 627    i.e. Quality == "ok"
 
Looking at just (Ranging Failures) / (Ranging Requests): 10.4% failure rate
Looking at (Ranging Failures + CRC Failures) / (Ranging Requests): 11.9% failure rate
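
For anyone repeating this test, the two rates above can be reproduced from the raw counters with a few lines of Python. This is a minimal sketch; the function name failure_rates is hypothetical, and the inputs are the counts listed above.

```python
def failure_rates(requests, ranging_failures, crc_failures):
    """Return (ranging-only rate, ranging + CRC rate) as percentages.

    Both rates use the number of ranging requests as the denominator,
    so scanning failures (no request was ever made) are excluded.
    """
    ranging_only = 100.0 * ranging_failures / requests
    with_crc = 100.0 * (ranging_failures + crc_failures) / requests
    return round(ranging_only, 1), round(with_crc, 1)


print(failure_rates(699, 58, 15))  # initiator: (8.3, 10.4)
print(failure_rates(712, 74, 11))  # reflector: (10.4, 11.9)
```

Note that in both tables Ranging Successes equals Ranging Requests minus (Ranging Failures + CRC Failures), so the second rate is simply the share of requests that did not end with Quality == "ok".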
  
 
Logic Analyzer
I'm seeing two main types of "ranging" failures in the logic analyzer. Note that I added a "success" GPIO that is toggled when the data_ready() callback indicates a ranging event completed. 
 
Success
For comparison purposes, you can see both "reflector" (blue) and "initiator" (red) get the dm request, start ranging, and get a result.
 
Failure Mode #1 - Short Ranging Window
The time spent "ranging" on both the "reflector" (blue) and "initiator" (red) is truncated in this failure mode. The ranging window lasts only a few milliseconds, and neither device gets results.
 
Failure Mode #2 - Long Ranging Window
The time spent "ranging" on both the "reflector" (blue) and "initiator" (red) is NOT truncated in this failure mode. The ranging window lasts a few dozen milliseconds, as in the success case, yet neither device gets results.
 
  • Unfortunately, I've moved on to other tasks, so I can't spend any more time looking into this. However, your 1% failure rate is quite surprising to me, considering I repeated this test at my house and saw failure rates closer to 5%.

    Here are my results from that test (which I ran overnight). Unfortunately, I don't have the logs you're asking for, just the statistics below.

    initiator_failure: 634
    initiator_failure_range: 476
    initiator_failure_crc: 61
    initiator_failure_scan: 97
    initiator_success: 9449
    initiator_request: 9986
    initiator_total: 10083

    Looking at just (Ranging Failures) / (Ranging Requests): 4.8% failure rate
    Looking at (Ranging Failures + CRC Failures) / (Ranging Requests): 5.4% failure rate

    reflector_failure: 967
    reflector_failure_range: 736
    reflector_failure_crc: 226
    reflector_failure_scan: 5
    reflector_success: 9116
    reflector_request: 10078
    reflector_total: 10083

    Looking at just (Ranging Failures) / (Ranging Requests): 7.3% failure rate
    Looking at (Ranging Failures + CRC Failures) / (Ranging Requests): 9.5% failure rate

    Thanks for your help,

    Cal

  • Hi,

    I see that you still have a lot of range failures, but also CRC and scan failures.

    I am not quite able to read the version numbers and production date from the stickers on the boards. What are they, for the two boards? I can potentially check against known issues with SoC and/or DK versions.

    I see in the photograph of your setup that one DK rests on top of a breadboard. Antennas are affected by conductive material in their near field. It may be as simple as the copper in the breadboard changing the electrical surroundings of the antenna enough to affect RF performance. The antenna is at the edge of the DK, under the Nordic Semiconductor logo and text. Rotating the DK by 90 degrees, so that the antenna end sticks out by at least 1-2 cm (half an inch or so) from the breadboard below, should be enough to eliminate that potential error source.

    I am afraid that for further debugging you will need to either get better logs or do a debug session. Figuring out exactly where it fails is key to understanding why you see the bad performance, and consequently to solving it.

    A BLE sniffer trace could also provide some information, but that would be for the BLE part only. (E.g. show timing of advertising, scan request and scan response, as well as CRC failures for those as seen by the sniffer.)

    Regards,
    Terje
