This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Bonding with FDS and DFU sometimes fail

Hi

I am using nRF52DK, SDK 15.0, SD 6.0.0, multiperipheral based example with secure bootloader, FDS (7 FDS_VIRTUAL_PAGES). Uploading new software works (not tested thorougly).

Project building with SES, then flashing with a script:

SET family= NRF52
SET application= C:\repo\pikflometr\multiperipheral\pca10040\s132\ses\Output\Release\Exe\ble_app_multiperipheral_pca10040_s132.hex
SET application-version= 1
SET soft_device= C:\repo\pikflometr\lib\wszystkoCzegoPotrzebujesz\s132_nrf52_6.0.0_softdevice.hex
SET bootloader= C:\repo\pikflometr\secure_bootloader\pca10040_ble\ses\Output\Release\Exe\secure_bootloader_ble_s132_pca10040.hex
SET bootloader-version= 1
SET bl-settings-version= 1
SET settings= settings_1.hex
SET dfu_client= dfu_client_1.hex
SET nrfutil_location= C:\repo\pikflometr\lib
cd %nrfutil_location%
nrfjprog.exe --family %family% --program %soft_device% --chiperase --verify
nrfutil settings generate --family %family% --application %application% --application-version %application-version% --bootloader-version %bootloader-version% --bl-settings-version %bl-settings-version% %settings%
mergehex -m %application% %settings% -o %dfu_client%
nrfjprog.exe --family %family% --program %dfu_client% --verify
nrfjprog.exe --reset --family %family% --program %bootloader% --verify

then F5 to begin debugging with SES. 

Sometimes (maybe 15% of times) when I try to pair (using master Androd device) I get:

<error> app: ERROR 2 [NRF_ERROR_SOFTDEVICE_NOT_ENABLED] at C:\repo\pikflometr\multiperipheral\main.c:1808 PC at: 0x00033AA7

which is from PM_EVT_ERROR_UNEXPECTED case

static void pm_evt_handler(pm_evt_t const * p_evt)
{
    ret_code_t err_code;
    switch(p_evt->evt_id)
    {

        case PM_EVT_ERROR_UNEXPECTED:
        NRF_LOG_RAW_INFO("PM_EVT_ERROR_UNEXPECTED \n");
            // Assert.
            APP_ERROR_CHECK(p_evt->params.error_unexpected.error);
            
...}

It tells me that NRF_ERROR_SOFTDEVICE_NOT_ENABLED  which is silly because I have already connected with an advertising device, so BLE stack is up and running.

After some digging I have found out that peer_data_delete_process() calls send_unexpected_error()

  

So it looks like sometimes the FDS module does not initialize.

I added some logs and here is what I got after booting:

FAILING CASE:

<info> app: __________________start
<info> app: ~~ ZMIANA NAZWY (na v1) W: gap_params_init
<info> app: Setting vector table to bootloader: 0x00078000
<info> app: Setting vector table to main app: 0x00026000
<info> app: zostaje domyslny adv_name
<error> app: bzdurne wartosci ts:0, cnt:0
<info> app: nie ma w ogole pomiarow offline, bo secret_word to: 26DB1 wobec 0xAAAAAAAA
<info> app: epoch 0
<info> app: battery_meas handler rusza
<info> app: reset_reason: 0x00000004.

WORKING CASE:

<info> app: __________________start
<info> app: ~~ ZMIANA NAZWY (na v1) W: gap_params_init
<info> app: Setting vector table to bootloader: 0x00078000
<info> app: Setting vector table to main app: 0x00026000
.......fds..FDS_EVT_INIT
.......fds..FDS_EVT_WRITE
.......fds..FDS_EVT_UPDATE
.......fds..FDS_EVT_DEL_RECORD
<info> app: zostaje domyslny adv_name
<error> app: bzdurne wartosci ts:0, cnt:0
<info> app: nie ma w ogole pomiarow offline, bo secret_word to: 26DB1 wobec 0xAAAAAAAA
<info> app: epoch 0
<info> app: battery_meas handler rusza
<info> app: reset_reason: 0x00000001.

So all the difference is the FDS event logs don't appear occassionaly.

The logs that makes difference between failing and working case are in peer_data_storage.c:

static void fds_evt_handler(fds_evt_t const * const p_fds_evt)
{
    pm_evt_t pds_evt =
    {
        .peer_id = file_id_to_peer_id(p_fds_evt->write.file_id)
    };

    switch (p_fds_evt->id)
    {   
        case FDS_EVT_INIT:
            NRF_LOG_RAW_INFO(".......fds..FDS_EVT_INIT\n");
        case FDS_EVT_WRITE:
            NRF_LOG_RAW_INFO(".......fds..FDS_EVT_WRITE\n");
        case FDS_EVT_UPDATE:
            NRF_LOG_RAW_INFO(".......fds..FDS_EVT_UPDATE\n");
        case FDS_EVT_DEL_RECORD:
            NRF_LOG_RAW_INFO(".......fds..FDS_EVT_DEL_RECORD\n");
            
.....}

---

Furthermore I have added those logs to observe the state of flash memory:

     NRF_LOG_INFO("pages_available %d open_records %d valid_records %d",
                   stat.pages_available,stat.open_records,stat.valid_records);            
     NRF_LOG_INFO("dirty_records %d words_reserved %d words_used %d",
                  stat.dirty_records,stat.words_reserved,stat.words_used);            
     NRF_LOG_INFO("largest_contig %d freeable_words %d corruption %d",
                  stat.largest_contig,stat.freeable_words,stat.corruption); 

And here is the output (called after booting before trying to pair):

FAILING CASE:

<info> app: pages_available 0 open_records 0 valid_records 0
<info> app: dirty_records 0 words_reserved 0 words_used 0
<info> app: largest_contig 0 freeable_words 0 corruption 0

WORKING CASE:

<info> app: pages_available 7 open_records 0 valid_records 0
<info> app: dirty_records 0 words_reserved 0 words_used 12
<info> app: largest_contig 1022 freeable_words 0 corruption 0

What more info do you need to guide me to solve this?

Thanks

  • Also sd_ble_gap_connect(...) is not being called at all in my application

    Yes, but assume you're starting connectable advertising so a nearby central could chose to send a  connect request? I'd recommend to postpone advertising until pending flash operations are complete like the adv. module. 

    If your guess is correct the return code does not explain the reason properly and the error caught by

    Return code from which function? A NRF_SUCCESS from the  sd_flash APIs only means the task got accepted into the flash queue. The result of the operation will be reported to the fds callback. 

  • Oh yes of course, it is connectable advertising. And it advertises until given number of masters are connected (multiperipheral). 

    You suggest postponing starting advertising so flash initialization can be performed completely because it is essential. But what about potential collision between normal periodical advertising packet and occassional flash write? High probability of flash operation success means that there is a tiny risk of FDS operation failure we should accept?

    By return code I meant NRF_ERROR_SOFTDEVICE_NOT_ENABLED when trying to pair.

  • Ok, so that may explain why it's not failing consistently. See "SoftDevice timing-activities and priorities" for details on how the scheduler priorities the different events/tasks. 

    Pawel Niedbala said:
    High probability of flash operation success means that there is a tiny risk of FDS operation failure we should accept?

     The app may retry the flash operation later. Writing small chunks of data should not be a problem, but flash erase (garbage collection) is harder to schedule.

    Pawel Niedbala said:
    By return code I meant NRF_ERROR_SOFTDEVICE_NOT_ENABLED when trying to pair.

     It's definitely misleading. That is why the FDS errors now use a different number range. 

Related