Random Bonding problem

Under some conditions, a device that is already bonded starts crashing the MCU at the moment the BLE connection is established. The only fix we have found is to erase all bonds, which is not acceptable for devices that are already in the field.

We use the latest nRF5 SDK, 17.1.0.

While debugging, the lowest level to which we can trace the problem is the call to sd_ble_gatts_sys_attr_get in components/ble/peer_manager/gatt_cache_manager.c, which returns NRF_ERROR_DATA_SIZE.

The structure passed to the function has the following values at that moment:
peer_data.p_local_gatt_db->len = 62

peer_data.p_local_gatt_db->flags = 3

As far as we can tell, this is not specific to a particular phone or OS (Android or iOS). The affected phone can still connect normally to other devices afterwards.
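For readers decoding the values above, here is a minimal sketch in C. It assumes the numeric values from the SoftDevice headers (nrf_error.h and ble_gatts.h), copied locally so it builds off-target; the helper names `nrf_err_name`, `has_sys_srvcs`, and `has_usr_srvcs` are hypothetical and are not part of the SDK. It only shows that error code 12 corresponds to NRF_ERROR_DATA_SIZE and that flags = 3 has both system-attribute flag bits set.

```c
#include <stdint.h>

/* Values copied for illustration only; the real definitions live in the
 * SoftDevice headers nrf_error.h and ble_gatts.h. */
#define NRF_SUCCESS                       0u
#define NRF_ERROR_INVALID_DATA            11u
#define NRF_ERROR_DATA_SIZE               12u
#define BLE_GATTS_SYS_ATTR_FLAG_SYS_SRVCS (1u << 0)
#define BLE_GATTS_SYS_ATTR_FLAG_USR_SRVCS (1u << 1)

/* Hypothetical helper: map a few error codes to readable names. */
static const char *nrf_err_name(uint32_t err_code)
{
    switch (err_code) {
        case NRF_SUCCESS:            return "NRF_SUCCESS";
        case NRF_ERROR_INVALID_DATA: return "NRF_ERROR_INVALID_DATA";
        case NRF_ERROR_DATA_SIZE:    return "NRF_ERROR_DATA_SIZE";
        default:                     return "UNKNOWN";
    }
}

/* Hypothetical helpers: test the individual system-attribute flag bits. */
static int has_sys_srvcs(uint32_t flags)
{
    return (flags & BLE_GATTS_SYS_ATTR_FLAG_SYS_SRVCS) != 0;
}

static int has_usr_srvcs(uint32_t flags)
{
    return (flags & BLE_GATTS_SYS_ATTR_FLAG_USR_SRVCS) != 0;
}
```

Calling `nrf_err_name(12)` gives the name reported in this post, and both flag helpers return non-zero for the value 3.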

  • Hello,

    It is not clear how the program is crashing even if sd_ble_gatts_sys_attr_get() is returning with NRF_ERROR_DATA_SIZE. Are you able to get debug logs from the device? And does the program enter the app_error_fault_handler()?

    Best regards,

    Vidar

  • Hi Vidar,

    It took us a while to reproduce it, and I can again confirm that the error code is 12. The logs are below:
    <info> peer_manager_handler: Peer data updated in flash: peer_id: 0, data_id: Peer rank, action: Update, no change
    <warning> peer_manager_gcm: The local database has changed, so some subscriptions to notifications and indications could not be restored for conn_handle 0
    <warning> peer_manager_handler: Local DB could not be applied: conn_handle: 0, peer_id: 0
    <info> peer_manager_handler: Connection secured: role: Peripheral, conn_handle: 0, procedure: Encryption
    <info> peer_manager_handler: Peer data updated in flash: peer_id: 0, data_id: Peer rank, action: Update, no change
    <error> app: ASSERTION FAILED at /home/dani/projects/tq/hpr-disp-04-05-fw/nRF5-SDK/components/ble/peer_manager/gatt_cache_manager.c:300

    I forgot to mention that we have included the fix from
    RE: Peer manager and FDS, new records at each connection.

    which, as I now see, was written by you.

    Additionally, I noticed that this is triggered by the 0x2902 descriptor (CCCD) of the Service Changed characteristic.

  • Hi,

    The issue discussed in the linked Q&A was already fixed in nRF5 SDK 17.1.0:

    *** Bug fixes
    ****************

    NRFFOETT-2519: The local db data will not be written to flash if it is already up to date.
    This means that if a client updates a CCCD to the same value as before,
    no flash operation will happen.

    I recommend reverting your changes to the peer manager, if it's not too late, and using the official fix, which was tested as part of the SDK release.

    DaniSi said:
    <warning> peer_manager_gcm: The local database has changed, so some subscriptions to notifications and indications could not be restored for conn_handle 0
    <warning> peer_manager_handler: Local DB could not be applied: conn_handle: 0, peer_id: 0

    This indicates that the attribute table has changed. Was there anything that happened prior to this (e.g., a device firmware update)? 

    DaniSi said:
    <error> app: ASSERTION FAILED at /home/dani/projects/tq/hpr-disp-04-05-fw/nRF5-SDK/components/ble/peer_manager/gatt_cache_manager.c:300

    What is the ASSERT you have at this line? This is also different from the error you reported initially. 
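As background for the "Local DB could not be applied" warning, here is a rough sketch in C of the documented fallback when stored system attributes no longer match the attribute table: if setting them is rejected as invalid data, the attributes are initialized fresh by passing NULL, and the client has to re-subscribe. `fake_sys_attr_set` is a hypothetical stand-in for sd_ble_gatts_sys_attr_set so the snippet builds off-target, and `apply_stored_sys_attr` is an illustrative name. This is not the Peer Manager's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied for illustration; see nrf_error.h for the real definitions. */
#define NRF_SUCCESS            0u
#define NRF_ERROR_INVALID_DATA 11u

/* Length the pretend attribute table currently expects (assumption). */
static uint16_t fake_expected_len = 70u;

/* Hypothetical stand-in for sd_ble_gatts_sys_attr_set(): accepts NULL
 * (fresh initialization) or data of exactly the expected length. */
static uint32_t fake_sys_attr_set(uint16_t conn_handle, const uint8_t *p_data,
                                  uint16_t len, uint32_t flags)
{
    (void)conn_handle;
    (void)flags;
    if (p_data == NULL) {
        return NRF_SUCCESS;
    }
    return (len == fake_expected_len) ? NRF_SUCCESS : NRF_ERROR_INVALID_DATA;
}

typedef enum { APPLY_RESTORED, APPLY_RESET, APPLY_FAILED } apply_result_t;

/* Try to restore the stored attributes; fall back to a clean state if the
 * stored data no longer fits the attribute table. */
static apply_result_t apply_stored_sys_attr(uint16_t conn_handle,
                                            const uint8_t *p_stored,
                                            uint16_t len, uint32_t flags)
{
    uint32_t err = fake_sys_attr_set(conn_handle, p_stored, len, flags);
    if (err == NRF_SUCCESS) {
        return APPLY_RESTORED;
    }
    if (err == NRF_ERROR_INVALID_DATA) {
        /* Stored data is stale: start with empty system attributes. */
        err = fake_sys_attr_set(conn_handle, NULL, 0, 0);
        if (err == NRF_SUCCESS) {
            return APPLY_RESET;
        }
    }
    return APPLY_FAILED;
}
```

With a stored length of 62 and a table that expects 70 bytes, the sketch ends in APPLY_RESET, which corresponds to subscriptions not being restored rather than to a crash.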

  • I recommend reverting your changes to the peer manager, if it's not too late, and using the official fix, which was tested as part of the SDK release.

    OK, I've reverted the changes. We will probably need some more time to verify whether the problem still persists.

    This indicates that the attribute table has changed. Was there anything that happened prior to this (e.g., a device firmware update)? 

    The last occurrence happened during development. We have several branches, so the attribute table may have changed, but I can't confirm it. However, we have already received a report of crashes that suddenly started occurring with no changes to the attribute table.

    What is the ASSERT you have at this line? This is also different from the error you reported initially. 

    The assert happens at the same spot I described in the first post. In the modified code, this was line 300:

    err_code = sd_ble_gatts_sys_attr_get(conn_handle, sys_attr_data, &len, peer_data.p_local_gatt_db->flags);

    ASSERT(err_code == NRF_SUCCESS);
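One way to avoid asserting at this spot, sketched in C under stated assumptions: treat NRF_ERROR_DATA_SIZE as "the stored data no longer matches and needs an update" instead of as a fatal error. `fake_sys_attr_get` is a hypothetical stand-in for sd_ble_gatts_sys_attr_get so the snippet builds off-target; `check_stored_db`, `db_check_t`, and `SYS_ATTR_BUF_MAX` are illustrative names and are not taken from the SDK or from the posted fix.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied for illustration; see nrf_error.h for the real definitions. */
#define NRF_SUCCESS         0u
#define NRF_ERROR_DATA_SIZE 12u
#define SYS_ATTR_BUF_MAX    128u /* assumed upper bound for this sketch */

/* Size the pretend SoftDevice currently needs (assumption). */
static uint16_t fake_required_len = 70u;

/* Hypothetical stand-in for sd_ble_gatts_sys_attr_get(): fails with
 * NRF_ERROR_DATA_SIZE when the supplied buffer is too small. */
static uint32_t fake_sys_attr_get(uint16_t conn_handle, uint8_t *p_buf,
                                  uint16_t *p_len, uint32_t flags)
{
    (void)conn_handle;
    (void)flags;
    if (p_buf != NULL) {
        if (*p_len < fake_required_len) {
            return NRF_ERROR_DATA_SIZE;
        }
        memset(p_buf, 0xAB, fake_required_len);
    }
    *p_len = fake_required_len;
    return NRF_SUCCESS;
}

typedef enum { DB_UP_TO_DATE, DB_NEEDS_UPDATE, DB_ERROR } db_check_t;

/* Compare stored system attributes with the current ones without asserting. */
static db_check_t check_stored_db(const uint8_t *p_stored, uint16_t stored_len,
                                  uint32_t flags)
{
    uint8_t  current[SYS_ATTR_BUF_MAX];
    uint16_t len = stored_len; /* request exactly as much as was stored */

    if (stored_len > SYS_ATTR_BUF_MAX) {
        return DB_ERROR;
    }
    uint32_t err = fake_sys_attr_get(0, current, &len, flags);
    if (err == NRF_ERROR_DATA_SIZE) {
        /* Current data is larger than the stored copy, so it differs. */
        return DB_NEEDS_UPDATE;
    }
    if (err != NRF_SUCCESS) {
        return DB_ERROR;
    }
    if (len != stored_len || memcmp(current, p_stored, len) != 0) {
        return DB_NEEDS_UPDATE;
    }
    return DB_UP_TO_DATE;
}
```

With the values from the first post (stored length 62) and a table that now needs more bytes, the check reports DB_NEEDS_UPDATE instead of halting the application.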

  • DaniSi said:
    OK, I've reverted the changes. We will probably need some more time to verify whether the problem still persists.

    Sounds good, thanks. At least you will not get the exact same error.

    DaniSi said:

    The assert happens at the same spot I described in the first post. In the modified code, this was line 300:

    err_code = sd_ble_gatts_sys_attr_get(conn_handle, sys_attr_data, &len, peer_data.p_local_gatt_db->flags);

    ASSERT(err_code == NRF_SUCCESS);

    I see. I realize now that it comes from the code I posted, and that it does not exist in the original SDK 17.1.0 code I was looking at.
