
nRF5 SDK for Mesh: How to reset a Bluetooth Mesh node?

Dear Nordic experts,
 
This short code snippet is supposed to reset (unprovision) a Bluetooth Mesh node and reboot it (nRF5 SDK for Mesh 5.0.0):
 

void node_reset(void)
{
    __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, " >>>>> Resetting node <<<<< \n");
    if (mesh_stack_is_device_provisioned()) {
        mesh_stack_config_clear();   
        mesh_stack_device_reset();
    }
 
    schedule_reboot();  // just waits for 2 seconds before calling sd_nvic_SystemReset(); to reboot the node.
} 

 
node_reset() is called from the main() loop once an input pin is set. At the time it is called, the (custom/vendor) model is still operating. The problem is that mesh_stack_config_clear() ends up in app_error_fault_handler().
 
The following is printed at the debug terminal in SES:
 
<t:    2006463>, main.c,  252,  >>>>> Resetting node <<<<<  
<t:    2006553>, main.c,  213, Mesh event: NRF_MESH_EVT_CONFIG_STABLE
<t:    2006556>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006559>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006650>, main.c,  213, Mesh event: NRF_MESH_EVT_CONFIG_STABLE
<t:    2006653>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006656>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006747>, main.c,  213, Mesh event: NRF_MESH_EVT_CONFIG_STABLE
<t:    2006750>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006753>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006844>, main.c,  213, Mesh event: NRF_MESH_EVT_CONFIG_STABLE
<t:    2006847>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006850>, main.c,  217, Mesh event: NRF_MESH_EVT_FLASH_STABLE
<t:    2006869>, app_error_weak.c,  105, Mesh assert at 0x0002E65E (:0) 

 
And the stacktrace looks like this:
 
app_error_fault_handler()
mesh_assertion_handler()
backend_evt_handler()
write_complete_cb()
process_action_queue()
send_end_events()
bearer_event_handler()
bearer_event_flag_set()
mesh_config_backend_record_write()
dirty_entries_process()
mesh_config_entry_set()
seqnum_block_allocate()
mesh_stack_config_clear()
node_reset()
main() 

 
The questions now are:  
What's causing this issue? Or: How do you reset/unprovision and reboot a mesh node in nRF5 SDK for Mesh 5.0.0?  
 
Your help is very much appreciated,
Thank you,
Michael.

  • Hi Mike, 

    I would suggest calling mesh_stack_config_clear() instead of mesh_config_clear().

    Can you reproduce the issue using one of our examples, for example the light switch server? It has a feature where pressing button 4 clears the settings and resets the node. Please check the node_reset() function in main.c.

    Also, looking at how the config server handles the node reset command from the client (see handle_node_reset()), it goes through the following stages:
    typedef enum
    {
    NODE_RESET_IDLE,
    NODE_RESET_PENDING,
    #if MESH_FEATURE_GATT_PROXY_ENABLED
    NODE_RESET_PENDING_PROXY,
    #endif /* MESH_FEATURE_GATT_PROXY_ENABLED */
    NODE_RESET_FLASHING,
    } node_reset_state_t;

    So if the flash module is busy, it stays in a pending state (NODE_RESET_FLASHING) and waits for the NRF_MESH_EVT_FLASH_STABLE event before it continues with the node reset process.
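The idea of deferring the clear and the reboot until the flash is idle can be sketched as a small state machine. This is only a host-runnable illustration, not the SDK's implementation: reset_ctx_t, reset_request(), reset_on_flash_stable() and the stub_* helpers are hypothetical names. In real firmware, stub_config_clear() would stand in for mesh_stack_config_clear(), stub_reboot() for mesh_stack_device_reset(), and reset_on_flash_stable() would presumably be called from the mesh event handler on NRF_MESH_EVT_FLASH_STABLE. It also assumes a single stable event is enough after the clear, which may be too simple for a real node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states, loosely modelled on node_reset_state_t above. */
typedef enum {
    RESET_IDLE,         /* no reset requested */
    RESET_WAIT_STABLE,  /* requested while flash was busy, clear not issued yet */
    RESET_CLEARING,     /* clear issued, waiting for flash to become stable */
    RESET_REBOOT        /* reboot has been triggered */
} reset_state_t;

typedef struct {
    reset_state_t state;
    unsigned clear_calls;   /* how often the (stubbed) clear was issued */
    unsigned reboot_calls;  /* how often the (stubbed) reboot was issued */
} reset_ctx_t;

/* Stubs: stand-ins for the real clear and reboot calls (assumption). */
static void stub_config_clear(reset_ctx_t *ctx) { ctx->clear_calls++; }
static void stub_reboot(reset_ctx_t *ctx)       { ctx->reboot_calls++; }

/* Request a reset. If flash is busy, only remember the request. */
void reset_request(reset_ctx_t *ctx, bool flash_busy)
{
    if (ctx->state != RESET_IDLE) {
        return; /* a reset is already in progress, ignore repeats */
    }
    if (flash_busy) {
        ctx->state = RESET_WAIT_STABLE;
    } else {
        stub_config_clear(ctx);
        ctx->state = RESET_CLEARING;
    }
}

/* Advance the state machine each time flash reports that it is stable. */
void reset_on_flash_stable(reset_ctx_t *ctx)
{
    switch (ctx->state) {
    case RESET_IDLE:
    case RESET_REBOOT:
        break; /* nothing to do */
    case RESET_WAIT_STABLE:
        stub_config_clear(ctx);
        ctx->state = RESET_CLEARING;
        break;
    case RESET_CLEARING:
        stub_reboot(ctx);
        ctx->state = RESET_REBOOT;
        break;
    default:
        assert(false); /* unknown state */
    }
}
```

The point of the sketch is that nothing touches the configuration while a write is in flight, and the reboot only happens after the clear has been reported as written.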

  • Hi Hung,
     
    Thanks for your help. Just a quick question:  
    Imagine a simple custom node, containing a single element with a single instance of a proprietary/vendor model plus a configuration server and health server instance.
     
    What's the recommended API to store a few hundred bytes of model-related data in flash?
    model_config_file_xxx(), mesh_config_entry_xxx(), flash_manager_xxx()? Or something else?
     
    Are the model_config_file_xxx() API calls needed for vendor (i.e. non-standard/generic) models? Would it be safe to omit those calls? (Please note: config and health server are running as well.)
    Right now, I'm using the mesh_config_entry_xxx() API, which works quite well, except for the previously mentioned fault triggered by mesh_stack_config_clear(), which makes me think that I'm most likely using the wrong API for the job...

    Any clarification is welcome,
    Thank you,
    Michael.


     

  • Hi Michael, 
    The recommended way of storing data in mesh is to use your own file and entries in that file (mesh_config_entry_xxx). I would suggest having a look at the enocean example, where we store some custom data in a separate file.

    Could you try to reproduce the issue with one of our examples so we can test it here?

  • Hi Hung,

    Thanks for the fast reply! I'm using the mesh_config_entry_xxx() APIs right now, but they are causing some issues, especially when deleting the data during reset/unprovisioning.

    "Could you try to reproduce the issue with one of our example so we can test here ?"

    Sure, but the problem is a little more complicated: the way it looks right now, the issue happens if the node is provisioned but not fully set up, for example when the (Android) app starts provisioning but the connection gets lost before an App key is assigned. Trying to reset the node by calling mesh_stack_config_clear() then causes the firmware to get trapped in app_error_fault_handler(), basically bricking the device...

    I need some way to recover from it. Do you have any ideas how to prevent app_error_fault_handler() from making the node inaccessible/unresponsive?

  • Hi BlueMike, 

    You can use the addr2line.exe tool (I got it by installing MinGW) to find what causes app_error_fault_handler() to be called. You just need to pass it the .elf file and the address reported in the assert (in your case 0x0002E65E), and it will show you the file and the line of code that is causing the issue.

  • Hi Hung,

    Again, thanks for your help. Considering how quickly the softdevice ends up in app_error_fault_handler(), I assume that there's something fundamentally wrong. But let's start again at the beginning:

    1. The firmware initializes an input pin (E7: P0.05 - AIN3) to receive an event when a button is pushed.

    nrfx_gpiote_in_config_t configButtonA = NRFX_GPIOTE_CONFIG_IN_SENSE_TOGGLE(false);
    configButtonA.pull = NRF_GPIO_PIN_PULLUP;
    ERROR_CHECK(nrfx_gpiote_in_init(BUTTON_A_INPUT_PIN, &configButtonA, button_a_pushed_event));
    nrfx_gpiote_in_event_enable(BUTTON_A_INPUT_PIN, true); 

    2. The button's event handler just remembers the time the button is pushed:

    static void button_a_pushed_event(nrfx_gpiote_pin_t pin, nrf_gpiote_polarity_t action)
    {
        if (pin == BUTTON_A_INPUT_PIN) {
            int value = nrf_gpio_pin_read(pin);
            if (value) {
                reset_start_time = 0; // cancel reset
            } else {
                reset_start_time = get_uptime_in_milliseconds();
            }
        }
    }

    3. The main() function periodically polls the current time and passes it to handle_node_reset(), which performs the actual reset:

    int main(void)
    {
        initialize();
        start();
    
        while (true)
        {
            uint64_t uptime_in_milliseconds = calculate_uptime_in_milliseconds();
            handle_node_reset(uptime_in_milliseconds);
    
            handle_business_logic(uptime_in_milliseconds);
    
            nrf_delay_ms(50);
        }
    }
    
    

    @Hung: The firmware's business logic is working as intended, just resetting causes issues. Still, is this "busy-loop" in main() a safe way to handle the business logic and the reset? Do you see any potential issues here?

    4. handle_node_reset() performs the actual reset if the button remains pushed for a certain amount of time (RESET_DELAY below):

    void handle_node_reset(int64_t time_in_milliseconds)
    {
        if (reset_start_time == 0)
        {
            return;
        }
    
        if (time_in_milliseconds - reset_start_time > RESET_DELAY)
        {
            reset_start_time = 0; // we don't want to fall in here a second time before the reboot happens.
    
            if (mesh_stack_is_device_provisioned()) 
            {
                if (proxy_is_enabled()) 
                {
                    // Calling proxy_stop() also triggers app_error_fault_handler(), so disabled for now:
                    // proxy_stop(); 
                }
    
                mesh_stack_config_clear();
            }
    
            node_reset();
        }
    }
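One thing that stands out in steps 2 and 4 is that reset_start_time is shared between the interrupt handler and the main loop and that 0 doubles as "not pressed". A minimal alternative sketch, assuming a pull-up button (high = released) and a 32-bit millisecond tick, keeps a separate volatile flag and uses unsigned subtraction so that a tick wrap-around does not break the comparison. long_press_t, long_press_on_edge(), long_press_poll() and HOLD_TIME_MS are hypothetical names, and real firmware may additionally want a critical section around the two fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HOLD_TIME_MS 3000u /* hypothetical hold time, plays the role of RESET_DELAY */

typedef struct {
    volatile bool     pressed;       /* set/cleared in the button interrupt */
    volatile uint32_t press_time_ms; /* tick at which the button went down */
} long_press_t;

/* Call from the button edge interrupt with the current pin level and tick. */
void long_press_on_edge(long_press_t *lp, bool pin_is_high, uint32_t now_ms)
{
    assert(lp != NULL);
    if (pin_is_high) {
        lp->pressed = false;        /* released: cancel a pending long press */
    } else {
        lp->press_time_ms = now_ms; /* store the time first, then arm the flag */
        lp->pressed = true;
    }
}

/* Call from the main loop. Returns true exactly once per completed long press. */
bool long_press_poll(long_press_t *lp, uint32_t now_ms)
{
    assert(lp != NULL);
    if (!lp->pressed) {
        return false;
    }
    /* Unsigned subtraction yields the elapsed time even across a tick wrap. */
    if ((uint32_t)(now_ms - lp->press_time_ms) <= HOLD_TIME_MS) {
        return false;
    }
    lp->pressed = false; /* disarm so the caller is not triggered again */
    return true;
}
```

With this shape, a press that happens at tick 0 is still recognised, and the main loop only needs to act when long_press_poll() returns true.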



    When mesh_stack_config_clear() is executed, it ends up in app_error_fault_handler(). The call stack looks like this (Softdevice 132, 7.3.0 in debug build):

        File:                       Line:   Function:
        app_error_weak.c            79      void app_error_fault_handler(unsigned int id=0x00000000, unsigned int pc=0x00000000, unsigned int info=0x00000000)
        assertion_handler_weak.c    54      void mesh_assertion_handler(unsigned int pc=0x2000ecab)
        mesh_config.c               322     void backend_entry_evt_handler(const mesh_config_backend_evt_t* p_evt=0x2000edf0)
        mesh_config.c               372     void backend_evt_handler(const mesh_config_backend_evt_t* p_evt=0x0000003e)
        mesh_config_flashman_glue.c 132     void write_complete_cb(const flash_manager_t* p_manager=0x00000000, const fm_entry_t p_entry=0x2000ecab, enum result=FM_RESULT_SUCCESS (0))
        flash_manager.c             471     void end_action(action_t* p_action=0x20003cc0, const fm_entry_t* p_entry=0x00075008)
        flash_manager.c             773     _Bool process_action_queue()
        flash_manager.c             792     void flash_op_ended_callback(enum user=0x3e, const flash_operation_t* p_op=0x2000ecab, short unsigned int token=0x6a81)
        mesh_flash.c                159     _Bool send_end_events()
        bearer_event.c              380     _Bool bearer_event_handler()
        bearer_event.c              145     void QDEC_IRQHandler()
        bearer_event.c              303     void bearer_event_flag_set(unsigned int flag=0x0e5939a0)
        flash_manager.c             1090    void flash_manager_entry_commit(const fm_entry_t* p_entry=0x20003ccc)
        mesh_config_flashman_glue.c 371     uint32_t mesh_config_backend_record_write(mesh_config_backend_file_t* p_file=0x0ddfd680, const uint8_t* p_data=0x2000ef28, unsigned int length=0x00000005)
        mesh_config_backend.c       139     uint32_t mesh_config_backend_store(struct id={short unsigned int file=0x0000, short unsigned int record=0x0003}, const uint8_t* p_entry=0x2000ef28, unsigned int entry_len=0x00000005)
        mesh_config.c               157     uint32_t default_file_store(const mesh_config_entry_t* p_params=0x000475c4, struct id={short unsigned int file=0x0000, short unsigned int record=0x0003})
        mesh_config.c               195     void dirty_entries_process()
        mesh_config.c               234     uint32_t entry_store(const mesh_config_entry_params_t* p_params=0x000475c4, struct id={short unsigned int file=0x0000, short unsigned int record=0x0003}, const void* p_entry=0x2000efbc)
        mesh_config.c               510     uint32_t mesh_config_entry_set(struct id={short unsigned int file=0x0000, short unsigned int record=0x0003}, const void* p_entry=0x2000efbc)
        net_state.c                 462     void seqnum_block_allocate()
        net_state.c                 554     void net_state_reset()
        mesh_stack.c                272     void mesh_stack_config_clear()
                                            void handle_node_reset(long long int time_in_milliseconds=0x0000000000000003)
                                            void main()
    

    Note: The call stack was transcribed, not copied, so there could be some typos in there, sorry if there are.

    The question is, what can possibly cause the softdevice to end up in app_error_fault_handler() when calling mesh_stack_config_clear()?
    Any ideas are welcome.

  • Hi BlueMike, 
    Please use the addr2line.exe tool to check; it's easier than looking at the code and guessing what could be wrong. Or you can send us your .elf file and I can run it here.

  • Hi Hung,

    in debug configuration, addr2line outputs:

    nrf5_SDK_for_Mesh_v5.0.0_src/mesh/core/src/mesh_config.c:322 (discriminator 1)

    Oddly enough the problem doesn't happen that often in debug configuration. In release builds, however, it does happen reliably.

    Also: Although the firmware gets trapped in app_error_fault_handler() when calling mesh_stack_config_clear(), the node is unprovisioned after rebooting it.

    Any ideas what could be wrong here, or how to fix it, are welcome. Thank you,
    Michael.
