Reasons why bonds are deleted?

We have a custom board in production, based on the nRF52832 with the S132 SoftDevice and the nRF5 SDK.

Below is our Peer Manager code. We use a static key to pair and connect to the devices. Usually only one device will connect to the board.

Recently, some of our customers have had connection problems. The Peer Manager reports a security failure event with error 0x1006, and the only fix so far is to unpair and pair again from the mobile phone. We cannot understand why this happens. What can trigger the bonds to be deleted? Is there a way to track the reason from the PM_EVT_PEERS_DELETE_SUCCEEDED event?

The peer manager initialization:

/**@brief Function for the Peer Manager initialization.
 */
static void peer_manager_init(void)
{
    ble_gap_sec_params_t sec_param;
    ret_code_t           err_code;

    err_code = pm_init();
    hardfault = app_error_check_logger(err_code, true, log_str[PEER_MANAGER_INIT], 0, NULL);

    memset(&sec_param, 0, sizeof(ble_gap_sec_params_t));

    // Security parameters to be used for all security procedures. These are common parameters for bonding.
    sec_param.bond           = 1;
    sec_param.mitm           = 0;
    sec_param.lesc           = 0;
    sec_param.keypress       = 0;
    sec_param.io_caps        = BLE_GAP_IO_CAPS_DISPLAY_ONLY;
    sec_param.oob            = 0;
    sec_param.min_key_size   = 7;
    sec_param.max_key_size   = 16;
    sec_param.kdist_own.enc  = 1;
    sec_param.kdist_own.id   = 1;
    sec_param.kdist_peer.enc = 1;
    sec_param.kdist_peer.id  = 1;

    err_code = pm_sec_params_set(&sec_param);   // sets security parameters for pairing and bonding
    hardfault = app_error_check_logger(err_code, true, log_str[PEER_MANAGER_INIT], 1, NULL);

    err_code = pm_register(pm_evt_handler);   // register an event handler for the module
    hardfault = app_error_check_logger(err_code, true, log_str[PEER_MANAGER_INIT], 2, NULL);
}

The peer manager event handler:

static void pm_evt_handler(pm_evt_t const * p_evt)
{
    pm_handler_on_pm_evt(p_evt);    // Logging peer events. Starts encryption if connected to a bonded device.
    pm_handler_disconnect_on_sec_failure(p_evt);    // Disconnects if the connection was not secured.
    pm_handler_flash_clean(p_evt);

    switch (p_evt->evt_id)
    {
        case PM_EVT_CONN_SEC_SUCCEEDED:   //a link has been encrypted, result of a call of pm_conn_secure or of an action by the peer.
            m_peer_id = p_evt->peer_id;
            pm_local_database_has_changed();
            break;

        case PM_EVT_PEERS_DELETE_SUCCEEDED:   // all peers were cleared from flash storage (result of pm_peers_delete)
            NRF_LOG_INFO("PM_EVT_PEERS_DELETE_SUCCEEDED");
            hardfault = app_error_check_logger(1, false, "peersdel", 1, NULL);
            advertising_start(false);
            break;

        case PM_EVT_PEER_DATA_UPDATE_SUCCEEDED: // a piece of peer data was stored, updated or cleared in flash storage.
            if (     p_evt->params.peer_data_update_succeeded.flash_changed
                 && (p_evt->params.peer_data_update_succeeded.data_id == PM_PEER_DATA_ID_BONDING))
            {
                NRF_LOG_INFO("New Bond. Peer data update succeeded.");
            }
            break;
        case PM_EVT_CONN_SEC_FAILED:    // a pairing or encryption procedure has failed. in some cases, this means that security is not possible on this link.
        {
            NRF_LOG_INFO("Pairing or Encryption procedure failed. Peer id=%d, Error=%x, Procedure=%d, Source=%d",
                         p_evt->peer_id,
                         p_evt->params.conn_sec_failed.error,
                         p_evt->params.conn_sec_failed.procedure,
                         p_evt->params.conn_sec_failed.error_src);
            uint8_t proc = 0;
            if (p_evt->params.conn_sec_failed.error_src == 1)
            {
                proc = p_evt->params.conn_sec_failed.procedure + 10;
            }
            else
            {
                proc = p_evt->params.conn_sec_failed.procedure;
            }
            hardfault = app_error_check_logger(p_evt->params.conn_sec_failed.error, false, "secfail", proc, NULL);
            m_scoliosense.err_code = p_evt->params.conn_sec_failed.error;
            ble_error_sec_update(m_conn_handle, &m_scoliosense);
            break;
        }

        case PM_EVT_CONN_SEC_CONFIG_REQ:
        {
            // Allow or reject pairing request from an already bonded peer.
            NRF_LOG_INFO("Repairing Process was initiated.");
            pm_conn_sec_config_t conn_sec_config = {.allow_repairing = true};
            pm_conn_sec_config_reply(p_evt->conn_handle, &conn_sec_config);
            break;
        }

        default:
            break;
    }
}

Any help is much appreciated! 

Best regards,
Dimitra

  • hello, 

    We delete bonds when the button on the board is pressed, as a factory reset. However, customers claim that bonds are deleted without the button having been pressed. The mechanism is that after a reset we read the NRF_POWER->RESETREAS register, and if it is 0x01 we know that a pin reset occurred and therefore delete the bonds.
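
    A minimal sketch of the reset-reason check described above, written as a bitmask test instead of a comparison with 0x01. RESETREAS bits stay latched until they are cleared, so the raw value can hold more than one reason. The mask below is a local stand-in for POWER_RESETREAS_RESETPIN_Msk, and the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for POWER_RESETREAS_RESETPIN_Msk from the nRF headers (bit 0). */
#define RESETREAS_RESETPIN_MSK (1UL << 0)

/* Hypothetical helper: true if the pin-reset bit is set, regardless of any
 * other reset-reason bits that are still latched in the register value. */
static bool reset_was_pin_reset(uint32_t resetreas)
{
    return (resetreas & RESETREAS_RESETPIN_MSK) != 0;
}
```

    On the target, the raw register value could be logged before any bonds are deleted and then cleared (for example with sd_power_reset_reason_get and sd_power_reset_reason_clr while the SoftDevice is enabled), so that each deletion can be traced to a specific reset.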

    About pm_handler_flash_clean: do you think it is possible, and safe, to change it so that it does not delete the lowest-ranked peer ("Deletes the lowest ranked peer(s) when garbage collection is insufficient.") and I delete my own records to free up space instead?

    Where can I find how much space the Peer Manager needs? 
    Here https://docs.nordicsemi.com/bundle/sdk_nrf5_v17.0.2/page/lib_fds_functionality.html, it says that peer manager uses 0xC000 file IDs and Record keys until 0xFFFF. But how much space does it really need? 

    Finally, in my application Flash size (project section placement)  is 0x5a000 ( = 368640) . 
    When I build it in Segger Embedded Studio, it says that 233,7 KB are used.
    On the other hand, nRF52832 has 512 kB, we use Softdevice (152 kB) and bootloader (24 kB), and then the space left for the application is 336 kB.


    In sdk_config.h, I have set 13 virtual pages (which the Peer Manager also uses), and the virtual page size is 2048 words. It says that 1 page is used by GC, so do I have 2048 * 4 * 12 = 98304 bytes available?
    I am a bit confused about the real space that is left for my data and the Peer Manager, which I need to know in order to monitor the flash usage and free up space when needed. Could you help me with this too?

    Thank you very much for your help

    Dimitra

  • Hello Dimitra,

    DimitraN said:
    About pm_handler_flash_clean: do you think it is possible, and safe, to change it so that it does not delete the lowest-ranked peer ("Deletes the lowest ranked peer(s) when garbage collection is insufficient.") and I delete my own records to free up space instead?

    It is possible, but it may require some rework of the PM implementation. However, if running GC is not sufficient, the bonding will fail. It seems better to avoid this issue altogether by running GC and deleting specific records before reaching this situation.

    DimitraN said:
    Where can I find how much space the Peer Manager needs? 
    Here https://docs.nordicsemi.com/bundle/sdk_nrf5_v17.0.2/page/lib_fds_functionality.html, it says that peer manager uses 0xC000 file IDs and Record keys until 0xFFFF. But how much space does it really need? 

    The size of each bond depends on how many CCCDs need to be stored. You can call fds_stat before and after pairing to determine the size requirements in your application.
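
    A minimal sketch of that before/after measurement, assuming two readings of fds_stat_t::words_used have already been taken with fds_stat, one before pairing and one after the bond has been written. The helper name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: number of words added between two words_used readings.
 * Returns 0 if usage did not grow (for example if GC ran in between). */
static uint16_t bond_words(uint16_t words_used_before, uint16_t words_used_after)
{
    if (words_used_after <= words_used_before)
    {
        return 0;
    }
    return (uint16_t)(words_used_after - words_used_before);
}
```

    The difference is only meaningful if nothing else writes to FDS and no garbage collection runs between the two readings.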

    DimitraN said:
    Finally, in my application Flash size (project section placement)  is 0x5a000 ( = 368640) . 
    When I build it in Segger Embedded Studio, it says that 233,7 KB are used.
    On the other hand, nRF52832 has 512 kB, we use Softdevice (152 kB) and bootloader (24 kB), and then the space left for the application is 336 kB.

    The 233.7 kB figure includes the SoftDevice if you are looking at the build output. You can remove this highlighted line in your flash_placement.xml if you want it to show only the size of the app.

    DimitraN said:
    In sdk_config.h, I have set 13 virtual pages (which the Peer Manager also uses), and the virtual page size is 2048 words. It says that 1 page is used by GC, so do I have 2048 * 4 * 12 = 98304 bytes available?
    I am a bit confused about the real space that is left for my data and the Peer Manager, which I need to know in order to monitor the flash usage and free up space when needed. Could you help me with this too?

    One page is reserved for GC, which leaves you with 12 pages * 2048 words for storing data records. Each page uses two words for the page tag.  fds_stat_t::words_used includes the page tag.
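
    The page arithmetic can be written out as a small sketch. It assumes the values quoted in this thread (one page kept for GC, two tag words per page); the function name is hypothetical, and the two arguments correspond to FDS_VIRTUAL_PAGES and FDS_VIRTUAL_PAGE_SIZE in sdk_config.h.

```c
#include <stdint.h>

#define GC_RESERVED_PAGES 1u  /* one page is reserved for garbage collection */
#define PAGE_TAG_WORDS    2u  /* each page uses two words for the page tag   */

/* Hypothetical helper: words available for data records across all data pages. */
static uint32_t fds_data_words(uint32_t virtual_pages, uint32_t page_size_words)
{
    if (virtual_pages <= GC_RESERVED_PAGES || page_size_words <= PAGE_TAG_WORDS)
    {
        return 0;
    }
    return (virtual_pages - GC_RESERVED_PAGES) * (page_size_words - PAGE_TAG_WORDS);
}
```

    With 13 virtual pages of 2048 words this gives 12 * 2046 = 24552 words, which is 98208 bytes at 4 bytes per word.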

    Best regards,

    Vidar

  •  fds_stat_t::words_used includes the page tag.

    OK, but does words_used include dirty words too?

    So if I understand correctly, I have 12 pages * (2048 words - 2 words for the page tag) = 24552 words available for my data and for the Peer Manager.
    The Peer Manager uses 53 words for one bond (including peer rank, service pending flag, etc.), according to words_used from fds_stat.

    Let's say that I want to leave plenty of space for the Peer Manager: 1 whole page.
    The available space then becomes 11 pages * 2046 words = 22506 words.

    Should I then delete some records when words_used >= 22506,
    or when words_used + freeable_words >= 22506?

    I call GC regularly, and I can see that freeable_words becomes 0 when GC succeeds, but I am not sure whether freeable words are included in words_used. If they are not included, then I must account for them in the calculation.

    e.g. if the limit of "full" space is 100 words. and words used = 100 and freeable = 10, and words used do not include freeable words, then i have exceeded the limit, and I will have a problem even if i call gc. 
    on the other hand, if words_used =100 and freeable = 10, but they're included in words_used, then truly I have 90 words and have time to call gc, and then delete extra records when needed. 

     I also want to apologize for the long details, but I have been working on this for a long time and maybe I confuse myself without a reason.. 


    Best regards, 

    Dimitra 

  • Hi Dimitra,

    It's best to not perform GC too frequently as it leads to increased flash wear. Freeable words are not included in words used. Also, if possible, avoid running GC while in a connection, as it can lead to scheduling conflicts. Running GC on this many pages is relatively time-consuming.

    DimitraN said:
    if the limit of "full" space is 100 words. and words used = 100 and freeable = 10, and words used do not include freeable words, then i have exceeded the limit, and I will have a problem even if i call gc. 

    This is correct. So in this case you would have to delete some records to get more "freeable" words.
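
    A minimal sketch of that decision, under the assumption stated in this reply that words_used does not include freeable_words. That assumption is worth confirming against fds_stat output on the target before relying on it. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum
{
    STORAGE_OK,             /* below the limit, nothing to do                  */
    STORAGE_RUN_GC,         /* GC alone would bring usage back under the limit */
    STORAGE_DELETE_RECORDS  /* live data alone reaches the limit               */
} storage_action_t;

/* Hypothetical helper. ASSUMPTION: words_used excludes freeable_words,
 * so the occupied space is words_used + freeable_words. */
static storage_action_t storage_next_action(uint32_t words_used,
                                            uint32_t freeable_words,
                                            uint32_t limit_words)
{
    if (words_used >= limit_words)
    {
        return STORAGE_DELETE_RECORDS;
    }
    if (words_used + freeable_words >= limit_words)
    {
        return STORAGE_RUN_GC;
    }
    return STORAGE_OK;
}
```

    For the example in the quote (limit 100, words used 100, freeable 10) this returns STORAGE_DELETE_RECORDS. Deleted records only become free space after a later GC run.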

    Best regards,

    Vidar

  • It's best to not perform GC too frequently as it leads to increased flash wear.

    But what is too frequently? 

    We have a record that is updated in FDS every 30 seconds, all day. This creates 6 freeable words every 30 seconds, i.e. 12 freeable words per minute * 60 * 24 = 17280 freeable words a day.
    If we call GC, for example, 2 times a day to free up these words, is that too frequent?

    The limit I will set is 22400 words: at that point I consider the flash full, stop saving new records and start deleting old ones, so that one page always stays available for the Peer Manager and it does not delete the bonds.

    So, is calling GC 2 times a day too frequent? If yes, then the only thing I can do is call it ONLY when freeable_words + words_used >= the limit of 22400.
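
    The rate arithmetic above can be sketched as follows. It counts only the one periodically updated record and ignores every other write to FDS, so it gives a rough upper bound on the time between GC runs, not a measurement. The function names are hypothetical.

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400u

/* Hypothetical helper: freeable words produced per day by one record that
 * leaves words_per_update dirty words behind every update_period_s seconds. */
static uint32_t freeable_words_per_day(uint32_t words_per_update, uint32_t update_period_s)
{
    if (update_period_s == 0u)
    {
        return 0u;
    }
    return words_per_update * (SECONDS_PER_DAY / update_period_s);
}

/* Hypothetical helper: seconds until budget_words have been turned into
 * dirty words by that record alone (whole updates only). */
static uint32_t seconds_to_fill(uint32_t budget_words,
                                uint32_t words_per_update,
                                uint32_t update_period_s)
{
    if (words_per_update == 0u)
    {
        return UINT32_MAX;
    }
    return (budget_words / words_per_update) * update_period_s;
}
```

    With 6 words every 30 seconds this gives 17280 words per day, and a budget of 22400 words lasts 111990 seconds, roughly 31 hours, if nothing else is stored.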

    Another question, after testing: I delete records once I have sent them over BLE to the mobile phone. At that point I expect almost all of words_used to become freeable_words, but sometimes fds_stat returns almost the same value for both fields, as if freeable_words had been updated and words_used had not been updated yet. Is this possible? Have you observed this kind of behaviour?

    Best regards,
    Dimitra
