Bug in NCS v1.7.1 with 32kHz RC Oscillator and MCUboot enabled

Well, this one took me many hours of tracing to track down the root cause, so I hope it helps someone!

Note: I wasn't able to reproduce this directly with a sample included in the nRF Connect SDK. But it happens with my application, based on the "light_ctrl" mesh sample.

It occurs only when all three of the following conditions are met (tested):

  • CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y
  • CONFIG_BOOTLOADER_MCUBOOT=y
  • nRF Connect SDK v1.7.1 (didn't occur with v1.4.2 I last used)

I also had to enable CONFIG_BT_LL_SW_SPLIT=y so that the Zephyr controller fits in the flash of the nRF52833 DK I'm using, but I'm not sure whether this is related (yes, I intend to move to the SoftDevice and know it is now recommended).

To fix it, I changed to CONFIG_CLOCK_CONTROL_NRF_K32SRC_SYNTH=y.

I've found these two issues, which appear to have similar failure logs:

https://github.com/zephyrproject-rtos/zephyr/pull/25583

https://devzone.nordicsemi.com/f/nordic-q-a/76899/zephyr-build-smp_svr-for-thingy52-fails

Build error log below

Kevin

  • I also filed a private support case with Nordic and got some more info:

    Okay, so it might not be due to a bug after all, but rather that the calibration routine for the 32kHz clock was changed (in nRF Connect SDK v1.6.0, I think). Basically, the 32kHz RC clock requires a calibration routine that depends on multithreading. So you will need to enable multithreading or disable the calibration in the sub-image that's not compiling. The temp_nrf5 files doesn't seem to handle calibration, but they do require multithreading, so please try enabling that in your application.

    I would think one of the configs CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION=n or CONFIG_MULTITHREADING=y would be sufficient here

    I ended up solving it with CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION=n to avoid the size overhead of multithreading. However, someone on the Zephyr bug says this isn't recommended. I'm not sure why, but I will post there.

    Note: this only needs to be set for the MCUboot image, not the main image, i.e. by adding it to a child_image/mcuboot.conf file as per https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/ug_multi_image.html, which is then merged with the default MCUboot configuration.
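
    As a minimal sketch of the note above, assuming the application follows the child_image directory layout described in the linked multi-image guide, the MCUboot-only fragment could look like the following. The commented-out line is the alternative from Nordic's reply; only one of the two should be needed.

    ```
    # child_image/mcuboot.conf
    # Assumed location: inside the application directory, next to prj.conf.
    # Applied to the MCUboot child image only; the main image is unaffected.

    # Option used in this thread: disable RC calibration in the bootloader.
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION=n

    # Alternative from Nordic's reply: keep calibration and enable
    # multithreading instead (costs more flash in the bootloader).
    # CONFIG_MULTITHREADING=y
    ```

    A pristine rebuild is probably the safest way to make sure the child image picks up the new fragment.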

  • Yes, I think (although only intuitively) that the RC clock accuracy in MCUboot shouldn't be an issue, but I could be wrong. We've not noticed anything yet after a week or two.

    I certainly think what you found could solve it, but it may not be the root cause. I think the root cause is more likely the calibration routine, since Nordic said this is what changed and that it was not designed to support single-threaded applications.

    This is similar to the ticket below, but the failure involves mutexes/locks (as in the error above) rather than sleep:

    devzone.nordicsemi.com/.../zephyr-build-smp_svr-for-thingy52-fails

  • "We've not noticed anything yet after a week or two."
    Thank you for sharing this helpful info, I'll probably also solve it this way.

    "The temp_nrf5 files doesn't seem to handle calibration"
    No, not directly, but the calibration code needs to sample the temperature from the nRF's internal temperature sensor, which is why those files are involved.

    Yes, the root cause is that the calibration now requires multithreading support, but the dependency between the CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION and CONFIG_MULTITHREADING config options is not defined in the code.
    It works in app builds because CONFIG_MULTITHREADING is turned on by default, and it doesn't work in MCUboot builds because MCUboot's config explicitly disables it.

  • I am having the same problem. Any progress?

  • What exactly do you mean?
    The solution is clearly written in the last response, and it was fixed over 3 years ago.

  • Hi,

    Thank you for sharing a link to the thread with the issue you are currently facing. Please continue the discussion in that new thread.

    If you see any similarities between the issue you are currently facing and the issues discussed in this older thread, then please link to this thread from the new one and describe the similarities.

    Regards,
    Terje