CONFIG_LOG=n Causes crash

SDK NCS 2.6.1: TF-M v1.7.x updated to SDK NCS 3.2.4: TF-M v2.2.0

What changed in TF-M 2.x that broke CONFIG_LOG=n

In TF-M 1.x (your old NCS 2.6.1), TF-M fully completed its secure initialization and then handed off to the NS app. The NS app started in a clean, fully-initialized TrustZone
environment.

In TF-M 2.x, TF-M's secure thread runs concurrently with the NS app — the handoff happens earlier. This means the NS app's SYS_INIT sequence now runs while TF-M's secure thread is
also active. To protect NS→S PSA API calls, TF-M 2.x introduced an NS interface mutex, initialized at POST_KERNEL priority 40.

If anything in the NS app calls a PSA API (crypto, entropy, etc.) before that mutex is initialized at P_40, it corrupts TF-M's NS call stack → BusFault → double-fault lockup.

With CONFIG_LOG=y, log_core occupies POST_KERNEL P_0, which appears to delay or serialize whatever was racing to call PSA APIs before P_40.

With CONFIG_LOG=n, log_core is gone from the init sequence, and something at POST_KERNEL P_0 (likely the nRF-WiFi shim work queue or entropy driver) calls into TF-M before the NS
interface mutex exists.

This was a non-issue on NCS 2.6.1 because TF-M 1.x didn't have the concurrent thread model — the NS interface was always ready before the NS app even started.

Parents
  • Hi,

     

    I am not sure I fully follow the issue at hand here.

    TF-M initialises first, then jumps to the non-secure application.

    PSA calls use non-secure callable functions to access the secure image, in this case TF-M.

     

    If you see a crash when logging is disabled, it indicates a timing based issue.

    Could you share information related to the crash? For instance a assert message / fault log?

     

    Kind regards,

    Håkon

  • The problem is that if I place the target into debug mode to problem goes away, so I am trying the following to see based on the following.

    Root Cause: Init Priority Race in TF-M 2.x

    What changed: In TF-M 1.x, the S-side was fully quiescent before handing off to NS. In TF-M 2.x, the NS Agent model means the NS world starts earlier — but critically, tfm_ns_interface_init() (which sets up the NS-side mutex/IPC glue) now runs as a SYS_INIT hook at POST_KERNEL level.

    The specific race: The CONFIG_ENTROPY_GENERATOR=y pulls in entropy_psa_crypto driver, which calls psa_crypto_init() during its own SYS_INIT. If that fires before tfm_ns_interface_init() completes, the PSA veneer call goes over TZ with an uninitialised NS interface → BusFault.

    CONFIG_LOG=y hides it: Deferred log mode spins up the log processing thread and adds kernel scheduling overhead, creating just enough slack that tfm_ns_interface_init() wins the race. With CONFIG_LOG=n the NS init path sprints ahead.

    I am going to test the following:

    Fixes (in order of preference)

    Fix 1 — Delay entropy init past the NS interface (cleanest):

    # prj.conf — check if this Kconfig exists in your NCS version:
    CONFIG_ENTROPY_PSA_CRYPTO_INIT_PRIORITY=90

    The NS interface init priority is typically ~50; setting entropy to 90 ensures ordering.

    Fix 2 — Explicit early barrier (defensive, works regardless):
    Add a SYS_INIT at POST_KERNEL level 0 that spin-polls psa_framework_version():

    static int wait_for_tfm_ready(void)
    {
    /* psa_framework_version() returns PSA_FRAMEWORK_VERSION (0x0100) when
    * the S-side is ready; returns 0 on error. Spin briefly. */
    for (int i = 0; i < 1000; i++) {
    if (psa_framework_version() != 0) return 0;
    k_busy_wait(100);
    }
    __ASSERT(false, "TF-M did not become ready");
    return -ETIMEDOUT;
    }
    SYS_INIT(wait_for_tfm_ready, POST_KERNEL, 0);

    Will advise on outcome

Reply
  • The problem is that if I place the target into debug mode to problem goes away, so I am trying the following to see based on the following.

    Root Cause: Init Priority Race in TF-M 2.x

    What changed: In TF-M 1.x, the S-side was fully quiescent before handing off to NS. In TF-M 2.x, the NS Agent model means the NS world starts earlier — but critically, tfm_ns_interface_init() (which sets up the NS-side mutex/IPC glue) now runs as a SYS_INIT hook at POST_KERNEL level.

    The specific race: The CONFIG_ENTROPY_GENERATOR=y pulls in entropy_psa_crypto driver, which calls psa_crypto_init() during its own SYS_INIT. If that fires before tfm_ns_interface_init() completes, the PSA veneer call goes over TZ with an uninitialised NS interface → BusFault.

    CONFIG_LOG=y hides it: Deferred log mode spins up the log processing thread and adds kernel scheduling overhead, creating just enough slack that tfm_ns_interface_init() wins the race. With CONFIG_LOG=n the NS init path sprints ahead.

    I am going to test the following:

    Fixes (in order of preference)

    Fix 1 — Delay entropy init past the NS interface (cleanest):

    # prj.conf — check if this Kconfig exists in your NCS version:
    CONFIG_ENTROPY_PSA_CRYPTO_INIT_PRIORITY=90

    The NS interface init priority is typically ~50; setting entropy to 90 ensures ordering.

    Fix 2 — Explicit early barrier (defensive, works regardless):
    Add a SYS_INIT at POST_KERNEL level 0 that spin-polls psa_framework_version():

    static int wait_for_tfm_ready(void)
    {
    /* psa_framework_version() returns PSA_FRAMEWORK_VERSION (0x0100) when
    * the S-side is ready; returns 0 on error. Spin briefly. */
    for (int i = 0; i < 1000; i++) {
    if (psa_framework_version() != 0) return 0;
    k_busy_wait(100);
    }
    __ASSERT(false, "TF-M did not become ready");
    return -ETIMEDOUT;
    }
    SYS_INIT(wait_for_tfm_ready, POST_KERNEL, 0);

    Will advise on outcome

Children
No Data
Related