Firmware Reset failing with MCUboot and WDT

Hi all,

We have an application which uses both MCUboot and a watchdog timer. This combination seems to be causing issues anytime we attempt a soft reset using sys_reboot or NVIC reset directly.

Our system:

  • Custom board using nRF5340
  • ncs v2.2.0

When we attempt a soft reset, the following happens:

  1. Device powers down and enters bg_thread_main() in "ncs\v2.2.0\zephyr\kernel\init.c", but it doesn't enter our application main()
  2. Device enters MCUboot
  3. After some time the watchdog resets the device, and it enters MCUboot again
  4. Finally the device enters bg_thread_main() again, this time our application main() runs successfully

All this is monitored using UART log (shown below).

This issue looks similar to another devzone post. However, we've tried the suggestion of adding CONFIG_BOOT_WATCHDOG_FEED to our mcuboot.conf, but we still get the same sequence of events.

According to the RESET table, software resets don't reset the WDT, which explains why we get two resets, but it doesn't explain why we don't enter our application main loop in step 1. We've tried forcing a WDT reset instead of a soft reset and this does fix the issue: we only get one reset and it enters our application main loop successfully.
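To check which reset source fired on each boot, it can help to decode the RESETREAS value early in startup. The following is a minimal sketch, not taken from this thread: the bit positions are assumptions for the nRF5340 application core and should be verified against the RESET chapter of the product specification or the MDK header, and describe_reset() is a hypothetical helper. How the register value is read depends on the build (with TF-M the peripheral may only be reachable from the secure side), so the value is passed in as a plain integer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMED bit positions for the nRF5340 application core RESETREAS register.
 * Verify against the product specification before relying on them. */
#define RESETREAS_RESETPIN (1u << 0) /* pin reset */
#define RESETREAS_DOG0     (1u << 1) /* application core watchdog 0 */
#define RESETREAS_SREQ     (1u << 3) /* soft reset request */
#define RESETREAS_KNOWN    (RESETREAS_RESETPIN | RESETREAS_DOG0 | RESETREAS_SREQ)

/* Hypothetical helper: writes a space-separated list of the flags that are
 * set in reas into buf and returns buf. Flags this sketch does not decode
 * are reported as "OTHER"; a value of zero is reported as "none". */
static const char *describe_reset(uint32_t reas, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s%s%s",
             (reas & RESETREAS_RESETPIN) ? "RESETPIN " : "",
             (reas & RESETREAS_DOG0) ? "DOG0 " : "",
             (reas & RESETREAS_SREQ) ? "SREQ " : "",
             (reas & ~RESETREAS_KNOWN) ? "OTHER " : "",
             (reas == 0u) ? "none" : "");
    return buf;
}
```

If the register is not cleared between boots, the flags accumulate, so the sequence described above would be expected to show SREQ after the soft reset and both DOG0 and SREQ once the watchdog has fired.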

One solution is to tie a GPIO directly to the RESET pin and use this GPIO for resets from within the application. This isn't ideal as it requires hardware changes and limits our design.

Does anyone have any other suggestions we can try?

  • Hello,

    The CONFIG_BOOT_WATCHDOG_FEED symbol should be selected by default when the build target is an nRF device. You can confirm this by checking whether the symbol is set in the generated configuration file (build/mcuboot/zephyr/.config). Either way, I would suggest that you try making the WD timeout longer (e.g., 30 seconds), if you have not done so already. Then debug the device to see where it hangs when the program isn't reaching main().

    Thanks,

    Vidar

  • Hi Vidar,

    Thanks for your reply.

    You're right that the "CONFIG_BOOT_WATCHDOG_FEED" option is included by default in the mcuboot configuration.


    From what I can tell, it seems the application is hanging on the main() function inside bg_thread_main(). All the functions called before main() appear to be exiting without any issues.

  • I'm sorry for the delayed response. I've been working on debugging the code to identify what is triggering the SPU event. It turned out to be quite challenging because the security violation is not caused by the CPU but by another bus master.

    I have not been able to pinpoint the source of the flash access error, nor have I found any other reports of this event happening under similar conditions. However, upgrading to SDK v2.4.4 may resolve this issue as it includes several errata workarounds that were not present in SDK v2.2.0.

    A workaround for now may be to issue a soft reset from the SPU ISR in TF-M to avoid having to wait for the WDT timeout. For example, by adding the following to ncs/v2.2.0/modules/tee/tf-m/trusted-firmware-m:

    diff --git a/platform/ext/common/faults.c b/platform/ext/common/faults.c
    index eb87c971c..9124030c2 100644
    --- a/platform/ext/common/faults.c
    +++ b/platform/ext/common/faults.c
    @@ -107,3 +107,8 @@ __attribute__((naked)) void UsageFault_Handler(void)
             "b         .                      \n"
         );
     }
    +
    +void SPU_IRQHandler(void)
    +{
    +    NVIC_SystemReset();
    +}
    \ No newline at end of file

    hugzy123 said:
    It seems that because the reset pin never goes low again, it doesn't complete a reset. You can see this using the "Boot Reset" button on the DK: if I hold the button down the device won't reset; it's only when I release the button that it resets.

    I think you are right. The specification does not guarantee the pin state in reset, only when going out of reset. I know this approach with using a GPIO to trigger pinreset worked with the nRF52840 but there are some differences in the reset mechanism between the nRF52840 and the nRF5340. 

  • Thanks for your reply.

    I have not been able to pinpoint the source of the flash access error, nor have I found any other reports of this event happening under similar conditions. However, upgrading to SDK v2.4.4 may resolve this issue as it includes several errata workarounds that were not present in SDK v2.2.0.

    Upgrading to a newer SDK does sound like a better fix. However, we've been struggling to get the partitions to conform to the alignment rules in the newer SDK versions for the trusted firmware module.

    A workaround for now may be to issue a soft reset from the SPU ISR in TF-M to avoid having to wait for the WDT timeout. For example, by adding the following to ncs/v2.2.0/modules/tee/tf-m/trusted-firmware-m:

    Thanks, we've verified this workaround in our current firmware. Do you know if there are any downsides to using this patch? 

    I think you are right. The specification does not guarantee the pin state in reset, only when going out of reset. I know this approach with using a GPIO to trigger pinreset worked with the nRF52840 but there are some differences in the reset mechanism between the nRF52840 and the nRF5340. 

    We're looking into adding an IC we can use which would generate a pulse on the pinreset line from an output GPIO of the nRF module.

  • hugzy123 said:
    Upgrading to a newer SDK does sound like a better fix. However, we've been struggling to get the partitions to conform to the alignment rules in the newer SDK versions for the trusted firmware module.

    Did you use the same memory layout in both versions? The SPU requires 32K alignment: https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/security/tfm.html#tf-m_partition_alignment_requirements.

    hugzy123 said:
    Thanks, we've verified this workaround in our current firmware. Do you know if there are any downsides to using this patch? 

    I do not foresee any problems with the workaround itself.

  • Did you use the same memory layout in both versions? The SPU requires 32K alignment: https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/security/tfm.html#tf-m_partition_alignment_requirements.

    Thank you, I will look into this.

    I do not foresee any problems with the workaround itself.

    Great, we'll use this for now.

  • Hi,

    Just wanted to double check something, does the SPU definitely require 32K alignment?

    The only reason I ask is that in my ncs v2.4.4 build I can see

    #define CONFIG_NRF_SPU_FLASH_REGION_SIZE 0x4000

    which would be 16K.

    I'm assuming CONFIG_NRF_SPU_FLASH_REGION_SIZE is what is used for CONFIG_NRF_TRUSTZONE_FLASH_REGION_SIZE in https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/security/tfm.html#tf-m_partition_alignment_requirements. I'm just not sure whether this would be 16K or 32K.

    Thanks,
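
For reference, the difference between the two candidate region sizes can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: REGION_16K and REGION_32K simply name the two values discussed above (0x4000 and 0x8000), the helper functions are hypothetical, and which size actually applies should be taken from the generated configuration of the build in question.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two region sizes under discussion; which one applies is an open
 * question in this thread, so both are provided for comparison. */
#define REGION_16K 0x4000u
#define REGION_32K 0x8000u

/* True when value is a multiple of region_size.
 * region_size must be a power of two. */
static bool is_region_aligned(uint32_t value, uint32_t region_size)
{
    return (value & (region_size - 1u)) == 0u;
}

/* Round value up to the next region boundary (power-of-two region_size). */
static uint32_t align_up(uint32_t value, uint32_t region_size)
{
    return (value + region_size - 1u) & ~(region_size - 1u);
}

/* A partition matches the region granularity when both its start address
 * and its size are multiples of region_size. */
static bool partition_is_aligned(uint32_t address, uint32_t size,
                                 uint32_t region_size)
{
    return is_region_aligned(address, region_size) &&
           is_region_aligned(size, region_size);
}
```

For example, a partition starting at 0xC000 passes is_region_aligned() with REGION_16K but fails with REGION_32K, and align_up(0xC000, REGION_32K) gives 0x10000 as the next usable boundary.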
     