
[MESH SDK] PB REMOTE - Device crash when provisioning

Hi,

I am trying to implement the PB Remote feature on my devices.

I have set up a pb remote server and a pb remote client on two devices. Both are configured.

But when I provision a device using the pb client and a pb remote as proxy, the device currently being provisioned crashes after the "Sending static auth keys" step, while the pb client and server do not crash and show a "provisioning failed" message.

On the crashed device, I have no call stack or log.

If I enable DEBUG & DEBUG_NR, I get: "<error> app: SOFTDEVICE: ASSERTION FAILED"

When I provision the device in "regular" mode, there are no issues.

Software used:

NRF SDK 17.1.0

MESH SDK 5.0.0

Regards

  • Hi,

    There can be a lot of devices involved in remote provisioning, as well as a lot of terminology. Therefore, can you please confirm my assumptions and provide the missing details as per my questions below?

    I will be referring to the following setup, from the Remote provisioning (PB-remote) documentation:

    But when I provision a device

    1) I assume this device is the device shown as "Provisionee" in the figure?

    using the pb client

    2) I assume the pb client is the PBR Client (Provisioner) in the figure?

    and a pb remote as proxy

    3) I am a bit confused which device and what role you mean here. Is it one of the devices in the above figure (if so which one)?

    4) By "proxy", do you mean a device which you connect to through use of the GATT Proxy feature, or do you mean proxy in terms of provisioning, or other?

    the device currently being provisioned crashes

    5) Is this the PBR Client (Provisioner), or is it the PBR Server, in the figure, or is it another device?

    after the "Sending static auth keys" step

    I am afraid I do need some more pointers to what you are referring to here:

    6) Did you find this step mentioned in a log message? If so what is the verbatim log message, and from which device?

    7) Do you refer to a place in the documentation? (If so where?)

    If I enable DEBUG & DEBUG_NR, I get: "<error> app: SOFTDEVICE: ASSERTION FAILED"

    This error message comes from the application, and there are no such error log messages in any of the examples in the SDK. Therefore it most likely comes from your specific application, for instance from a custom assert handler. The wording of the log message indicates that something went wrong in the SoftDevice (the BLE stack). Typically, if using the default error handling in the nRF5 SDK for Mesh, you will at least get a memory address for where the assert happens. With that address we are able to look into it from our side, but here there is no information about what crashes or where.

    8) On which device do you get this output, and what is the code in your application responsible for printing this message?

    Regards,
    Terje

  • Hi,

    This is the schema for the devices that I mentioned :

    The "Sending static auth keys" message was shown on the provisioner, and it is the last message I get before a "provisioning failed" message. This is the function in my "provisioner_helper.c" that shows this message:

    static void prov_evt_handler(const nrf_mesh_prov_evt_t * p_evt)
    {
        static dsm_handle_t addr_handle;
        static dsm_handle_t devkey_handle;
    
        switch (p_evt->type)
        {
          case NRF_MESH_PROV_EVT_UNPROVISIONED_RECEIVED:
          {
            if (!has_prov_request)
            {
              serial_unprov_found(p_evt->params.unprov.device_uuid);
            }
            break;
          }
    
          case NRF_MESH_PROV_EVT_LINK_CLOSED:
          {
            NRF_LOG_INFO("Local provisioning link closed: prov_state: %d\n", m_prov_state);
            if (m_prov_state == PROV_STATE_PROV)
            {
                m_prov_state = PROV_STATE_IDLE;
                m_provisioner.p_prov_failed_cb();
            }
            else if (m_prov_state == PROV_STATE_COMPLETE)
            {
              serial_prov_step_cb(PROV_STATE_LINK_CLOSE);
    
              m_provisioner.p_nw_data->provisioned_devices++;
              m_provisioner.p_prov_success_cb(m_provisioner.p_nw_data->last_device_address);
    
              NRF_LOG_INFO("Provisioning complete. %s - address: 0x%04x elements: %x\n", m_provisioner.p_nw_data->current_uri, m_provisioner.p_nw_data->last_device_address, m_target_elements);
    
              // Start device configuration after provisioning is complete.
              node_setup_start(m_provisioner.p_nw_data->last_device_address,
                               PROVISIONER_RETRY_COUNT,
                               m_provisioner.p_nw_data->appkey,
                               APPKEY_INDEX,
                               NETKEY_INDEX,
                               m_provisioner.p_nw_data->current_uri,
                               true);
              m_prov_state = PROV_STATE_IDLE;
            }
            break;
          }
    
          case NRF_MESH_PROV_EVT_COMPLETE:
          {
            NRF_LOG_INFO("Provisioning completed received\n");
            m_prov_state = PROV_STATE_COMPLETE;
    
            // Generate a custom devkey (the same generation is done on the device side, so if you change it here, you must also change it on the device)
            uint8_t custom_devkey[NRF_MESH_KEY_SIZE];
            for (uint8_t i=0; i<NRF_MESH_KEY_SIZE; i++)
            {
              custom_devkey[i] = prov_uuid[i%6];
            }
    
            //After provisioning completes, add corresponding node's address and device key to local database
            NRF_LOG_INFO("Adding device address, and device keys\n");
    
            if (dsm_address_publish_add(p_evt->params.complete.p_prov_data->address, &addr_handle) != NRF_SUCCESS)
            {
              NRF_LOG_INFO("Prov complete error : dsm_address_publish_add");
            }
            else if (dsm_devkey_add(p_evt->params.complete.p_prov_data->address, m_provisioner.p_dev_data->m_netkey_handle, custom_devkey, &devkey_handle) != NRF_SUCCESS)
            {
              NRF_LOG_INFO("Prov complete error : dsm_devkey_add");
            }
            else if (config_client_server_bind(devkey_handle) != NRF_SUCCESS)
            {
              NRF_LOG_INFO("Prov complete error : config_client_server_bind");
            }
            else if (config_client_server_set(devkey_handle, addr_handle) != NRF_SUCCESS)
            {
              NRF_LOG_INFO("Prov complete error : config_client_server_set");
            }
    
            NRF_LOG_INFO("Addr: 0x%04x addr_handle: %d netkey_handle: %d devkey_handle: %d\n", p_evt->params.complete.p_prov_data->address, addr_handle, m_provisioner.p_dev_data->m_netkey_handle, devkey_handle);
            break;
          }
    
          case NRF_MESH_PROV_EVT_CAPS_RECEIVED:
          {
            m_target_elements = p_evt->params.oob_caps_received.oob_caps.num_elements;
    
            uint32_t status = nrf_mesh_prov_oob_use(p_evt->params.oob_caps_received.p_context, NRF_MESH_PROV_OOB_METHOD_STATIC, 0, NRF_MESH_KEY_SIZE);
            if (status != NRF_SUCCESS)
            {
                NRF_LOG_INFO("Provisioning Failed. Cannot select static OOB. Could not assign node addr: 0x%04x\n", m_provisioner.p_nw_data->last_device_address);
                m_prov_state = PROV_STATE_IDLE;
                m_provisioner.p_prov_failed_cb();
            }
            break;
          }
    
          case NRF_MESH_PROV_EVT_STATIC_REQUEST:
          {
            const uint8_t static_data[NRF_MESH_KEY_SIZE] = STATIC_AUTH_DATA;
            if (nrf_mesh_prov_auth_data_provide(p_evt->params.static_request.p_context, static_data, NRF_MESH_KEY_SIZE) != NRF_SUCCESS)
            {
              NRF_LOG_INFO("Static authentication data provided failed\n");
              m_prov_state = PROV_STATE_IDLE;
              m_provisioner.p_prov_failed_cb();
              break;
            }
            NRF_LOG_INFO("Static authentication data provided\n");
            serial_prov_step_cb(PROV_STATE_AUTH_DATA);
            break;
          }
    
          case NRF_MESH_PROV_EVT_LINK_ESTABLISHED:
          {
            NRF_LOG_INFO("Provisioning link established\n");
            serial_prov_step_cb(PROV_STATE_LINK_OPEN);
            break;
          }
    
          default:
            break;
        }
    
    }

    Yes, I use a custom assertion handler, which is:

    #include "app_error.h"
    #if NRF_SD_BLE_API_VERSION == 1
    #include "nordic_common.h"
    #endif
    
    #include "nrf_log.h"
    #include "nrf_log_ctrl.h"
    #include "app_util_platform.h"
    #include "nrf_strerror.h"
    
    #if defined(SOFTDEVICE_PRESENT) && SOFTDEVICE_PRESENT
    #include "nrf_sdm.h"
    #endif
    
    
    void mesh_assertion_handler(uint32_t pc)
    {
        assert_info_t assert_info =
        {
            .line_num    = 0,
            .p_file_name = (uint8_t *)"",
        };
    #if NRF_SD_BLE_API_VERSION == 1
        app_error_handler(NRF_FAULT_ID_SDK_ASSERT, pc, (const uint8_t *) "error");
    #elif NRF_SD_BLE_API_VERSION >= 2
        app_error_fault_handler(NRF_FAULT_ID_SDK_ASSERT, pc, (uint32_t)(&assert_info));
    #endif
    
        UNUSED_VARIABLE(assert_info);
    }
    
    void app_error_fault_handler(uint32_t id, uint32_t pc, uint32_t info)
    {
        __disable_irq();
        NRF_LOG_FINAL_FLUSH();
    
    #ifndef DEBUG
        NRF_LOG_ERROR("Fatal error");
    #else
        switch (id)
        {
    #if defined(SOFTDEVICE_PRESENT) && SOFTDEVICE_PRESENT
            case NRF_FAULT_ID_SD_ASSERT:
                NRF_LOG_ERROR("SOFTDEVICE: ASSERTION FAILED");
                break;
            case NRF_FAULT_ID_APP_MEMACC:
                NRF_LOG_ERROR("SOFTDEVICE: INVALID MEMORY ACCESS");
                break;
    #endif
            case NRF_FAULT_ID_SDK_ASSERT:
            {
                assert_info_t * p_info = (assert_info_t *)info;
                NRF_LOG_ERROR("ASSERTION FAILED at %s:%u",
                              p_info->p_file_name,
                              p_info->line_num);
                break;
            }
            case NRF_FAULT_ID_SDK_ERROR:
            {
                error_info_t * p_info = (error_info_t *)info;
                NRF_LOG_ERROR("ERROR %u [%s] at %s:%u\r\nPC at: 0x%08x",
                              p_info->err_code,
                              nrf_strerror_get(p_info->err_code),
                              p_info->p_file_name,
                              p_info->line_num,
                              pc);
                    NRF_LOG_ERROR("End of error report");
                break;
            }
            default:
                NRF_LOG_ERROR("UNKNOWN FAULT at 0x%08X", pc);
                break;
        }
    #endif
    
        NRF_BREAKPOINT_COND;
        // On assert, the system can only recover with a reset.
    
    #ifndef DEBUG
        NRF_LOG_WARNING("System reset");
        NVIC_SystemReset();
    #else
        app_error_save_and_stop(id, pc, info);
    #endif // DEBUG
    }

    How can I get more information about the crash from this?

    By changing the log line to this: NRF_LOG_ERROR("SOFTDEVICE: ASSERTION FAILED: 0x%x", pc);

    I got: <error> app: SOFTDEVICE: ASSERTION FAILED: 0x15BA4

    After more tests, the provisioning crashed at different steps: sometimes after Auth Keys, sometimes during model configuration, sometimes just after the link opened. But each time I got the same error message and the same value for pc: 0x15BA4
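
    For reference, a minimal host-side sketch of a fault report that always includes the fault ID and the program counter, so that no fault class is reduced to a bare string. The `FAULT_ID_*` constants, `fault_id_name` and `format_fault_report` are illustrative stand-ins: the values are assumed to mirror `nrf_sdm.h` and `app_error.h` and should be checked against the headers, and on target the text would go through `NRF_LOG_ERROR` rather than `snprintf`.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Local stand-ins for the SDK fault IDs (assumed values, verify against
     * nrf_sdm.h and app_error.h in your SDK). */
    #define FAULT_ID_SD_ASSERT   0x00000001u
    #define FAULT_ID_APP_MEMACC  0x00001001u
    #define FAULT_ID_SDK_ERROR   0x00004001u
    #define FAULT_ID_SDK_ASSERT  0x00004002u

    /* Map a fault ID to a short human-readable name. */
    const char * fault_id_name(uint32_t id)
    {
        switch (id)
        {
            case FAULT_ID_SD_ASSERT:  return "SOFTDEVICE: ASSERTION FAILED";
            case FAULT_ID_APP_MEMACC: return "SOFTDEVICE: INVALID MEMORY ACCESS";
            case FAULT_ID_SDK_ERROR:  return "SDK ERROR";
            case FAULT_ID_SDK_ASSERT: return "SDK ASSERTION FAILED";
            default:                  return "UNKNOWN FAULT";
        }
    }

    /* Hypothetical helper: write "<name> (id 0x..., pc 0x...)" into buf.
     * Returns the number of characters snprintf would have written. */
    int format_fault_report(char * buf, size_t len, uint32_t id, uint32_t pc)
    {
        assert(buf != NULL && len > 0);
        return snprintf(buf, len, "%s (id 0x%08lX, pc 0x%08lX)",
                        fault_id_name(id), (unsigned long)id, (unsigned long)pc);
    }
    ```

    Printing the ID and pc for every case, including the default one, means the address is available even when the fault class was not anticipated.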

    Regards

  • Hi,

    The program counter value of 0x15BA4 is exactly what we need, yes. That is the exact place in the SoftDevice where the assert happens. This in turn will give us a clue as to what might be the main issue, and/or what should be investigated further.
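
    As a rough illustration of why such an address points into the SoftDevice: the SoftDevice occupies the flash between the MBR and the application start address, so a program counter inside that range cannot belong to application or mesh code. The sketch below assumes that the MBR ends at 0x1000 and that the application starts at 0x27000 for S140 v7.2.0 (check the linker script or the SoftDevice release notes for your version). The names `pc_region_t` and `pc_region_get` are hypothetical.

    ```c
    #include <stdint.h>

    /* Assumed flash layout for S140 v7.2.0 on nRF52840; verify for your setup. */
    #define MBR_END_ADDR    0x00001000u  /* assumption: MBR occupies 0x0000-0x0FFF */
    #define APP_START_ADDR  0x00027000u  /* assumption: application flash start */

    typedef enum
    {
        PC_REGION_MBR,
        PC_REGION_SOFTDEVICE,
        PC_REGION_APPLICATION
    } pc_region_t;

    /* Classify a program counter value by the flash region it falls in. */
    pc_region_t pc_region_get(uint32_t pc)
    {
        if (pc < MBR_END_ADDR)
        {
            return PC_REGION_MBR;
        }
        if (pc < APP_START_ADDR)
        {
            return PC_REGION_SOFTDEVICE;
        }
        return PC_REGION_APPLICATION;
    }
    ```

    Under these assumptions, `pc_region_get(0x15BA4)` returns `PC_REGION_SOFTDEVICE`.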

    My main suspicion is that the mesh stack (which takes control of the radio in timeslots provided by the SoftDevice) overstays in the timeslot, leading to the SoftDevice detecting that BLE stack activity has missed timing events and therefore issuing an assert (which in production leads to a reset). This is a fault prevention measure in the SoftDevice. However, this needs confirmation from the SoftDevice team, which may take until Monday.
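
    To make the suspected failure mode concrete, here is a simplified host-side model of a timeslot budget check: an operation may only start if it finishes, with a safety margin, before the slot ends. All names and numbers are hypothetical and do not reflect the actual mesh stack or SoftDevice implementation.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical margin reserved for handing the radio back in time. */
    #define TIMESLOT_END_MARGIN_US  500u

    /* Returns true if an operation of duration_us started at now_us (measured
     * from the start of the slot) completes before the slot end minus the
     * safety margin. */
    bool timeslot_operation_fits(uint32_t slot_length_us,
                                 uint32_t now_us,
                                 uint32_t duration_us)
    {
        if (slot_length_us < TIMESLOT_END_MARGIN_US)
        {
            return false; /* slot too short to do anything safely */
        }

        uint32_t deadline_us = slot_length_us - TIMESLOT_END_MARGIN_US;
        if (now_us > deadline_us)
        {
            return false; /* already inside the safety margin */
        }

        /* Compare against the remaining time to avoid overflow in now + duration. */
        return duration_us <= (deadline_us - now_us);
    }
    ```

    In this model, work that does not fit is deferred to the next slot; overstaying corresponds to starting it anyway, for example because an interrupt or a long critical section delayed the check.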

    In the meantime, to better understand your setup and what may be required to trigger the issue: What DKs, modules or custom boards are you using for the devices involved?

    Regards,
    Terje

  • Hi,

    What DKs, modules or custom boards are you using for the devices involved?

    I use a custom board based on an nRF52840 packaged inside a u-blox NINA module.

    I use s140_nrf52_7.2.0, the nRF5 SDK for Mesh 5.0.0 and the nRF5 SDK 17.1.0.

    However, this needs confirmation from the SoftDevice team, which may take until Monday.

    I will wait for the confirmation from the SoftDevice team. Thank you.

    My main suspicion is that the mesh stack (which takes control of the radio in timeslots provided by the SoftDevice) overstays in the timeslot, leading to the SoftDevice detecting that BLE stack activity has missed timing events and therefore issuing an assert (which in production leads to a reset).

    If that is the case, what can I do to avoid it?

    Is there a way to check this from the code?

    Regards

  • Another piece of information I forgot to mention: my projects are based on the SDK coexistence examples because I need both UART and mesh.

    Can the potential timeslot conflict come from this?
