Secure fault when accessing sensor channel format in Sensor Model handler

I'm developing a Bluetooth mesh lighting controller on a custom board using nRF Connect SDK v2.6.1. I'm trying to add a Sensor Server model to report energy usage, but I'm encountering a crash that I believe is related to TrustZone memory access.

My application works fine when built for the secure target (e.g., my_custom_board). However, when I build for the non-secure target (my_custom_board_ns), the application crashes every time a client requests a sensor reading.

The sensor server is the same as in Nordic's mesh light ctrl sample app. I've also tested the sample app itself on an nRF5340 development kit. Again, the sample app doesn't work for the nrf5340dk_nrf5340_cpuapp_ns target, but it does for the nrf5340dk_nrf5340_cpuapp target.

I may have just misunderstood the capabilities of the sensor server with _ns builds so please let me know if it's not possible to have a sensor server on a _ns target, but the documentation for the mesh light ctrl sample leads me to believe you can.

The error logs are shown below. (I log some checks in the energy_use_get function, which I've also included.)

[00:02:07.970,886] <inf> model_handler: energy_use_get: energy_use_get: sensor=0x200314f4 type=0xb846c
[00:02:07.970,886] <inf> model_handler: energy_use_get: energy_use_get: channel_count=1 channels_ptr=0xb4808
[00:02:07.970,916] <inf> model_handler: energy_use_get: energy_use_get: chan=0xb4808 format=0xb4c38
[00:02:07.970,916] <inf> model_handler: energy_use_get: energy_use_get: format->size=4 user_data=0xa89f8 cb=0x20031cb4
[00:02:07.970,947] <inf> model_handler: energy_use_get: energy_use_get: cb struct at 0x20031cb4
[00:02:07.970,977] <inf> model_handler: energy_use_get: energy_use_get: success dummy=1
[00:02:07.971,038] <err> os: secure_fault: ***** SECURE FAULT *****
[00:02:07.971,038] <err> os: secure_fault:   Address: 0x4
[00:02:07.971,069] <err> os: secure_fault:   Attribution unit violation
[00:02:07.971,069] <err> os: esf_dump: r0/a1:  0x00000000  r1/a2:  0x20031534  r2/a3:  0x20037920
[00:02:07.971,099] <err> os: esf_dump: r3/a4:  0x000a3bdb r12/ip:  0x000a9468 r14/lr:  0x000a3ac5
[00:02:07.971,099] <err> os: esf_dump:  xpsr:  0x01000200
[00:02:07.971,130] <err> os: esf_dump: s[ 0]:  0x003e7d69  s[ 1]:  0x00010000  s[ 2]:  0x00010000  s[ 3]:  0x0008fb8f
[00:02:07.971,130] <err> os: esf_dump: s[ 4]:  0x0008fb6d  s[ 5]:  0x00000000  s[ 6]:  0x00000006  s[ 7]:  0x000a571b
[00:02:07.971,160] <err> os: esf_dump: s[ 8]:  0x200314f4  s[ 9]:  0x200379cc  s[10]:  0x200302b8  s[11]:  0x20037978
[00:02:07.971,160] <err> os: esf_dump: s[12]:  0x20037b68  s[13]:  0x000924cd  s[14]:  0x000a4ed1  s[15]:  0x000a4ed5
[00:02:07.971,191] <err> os: esf_dump: fpscr:  0x000a4ed1
[00:02:07.971,191] <err> os: esf_dump: Faulting instruction address (r15/pc): 0x000a383e
[00:02:07.971,221] <err> os: z_fatal_error: >>> ZEPHYR FATAL ERROR 41: Unknown error on CPU 0
[00:02:07.971,252] <err> os: z_fatal_error: Current thread: 0x20032f40 (BT RX)
[00:02:08.826,599] <err> fatal_error: k_sys_fatal_error_handler: Resetting system



Using addr2line, I've traced the fault to the following line inside the SDK file nrf/subsys/bluetooth/mesh/sensor_types.c:

const struct scalar_repr *repr = format->user_data;

I believe this suggests my non-secure application is trying to access a pointer (format->user_data) that points to a secure memory region, causing a secure fault.

Below is the relevant code for my Sensor Server implementation:

static int dummy_energy_use;

static int energy_use_get(struct bt_mesh_sensor_srv *srv,
                          struct bt_mesh_sensor *sensor,
                          struct bt_mesh_msg_ctx *ctx,
                          struct bt_mesh_sensor_value *rsp)
{
    /* aforementioned logging and checks */

    int64_t micro = (int64_t)dummy_energy_use * 1000000LL;

    int rc = bt_mesh_sensor_value_from_micro(sensor->type->channels[0].format, micro, rsp);
    if (rc) {
        LOG_ERR("bt_mesh_sensor_value_from_micro failed: %d", rc);
        return rc;
    }

    dummy_energy_use++;
    LOG_INF("energy_use_get: success dummy=%d", dummy_energy_use);
    return 0;
}

static const struct bt_mesh_sensor_descriptor energy_use_desc = {
    .tolerance = {
        .negative = 0,
        .positive = 0,
    },
    .sampling_type   = BT_MESH_SENSOR_SAMPLING_UNSPECIFIED,
    .period          = 0,
    .update_interval = 0,
};

static struct bt_mesh_sensor energy_use = {
    .type       = &bt_mesh_sensor_precise_tot_dev_energy_use,
    .get        = energy_use_get,
    .descriptor = &energy_use_desc,
};

static struct bt_mesh_sensor *const sensors[] = {
    &energy_use,
};

static struct bt_mesh_sensor_srv sensor_srv = BT_MESH_SENSOR_SRV_INIT(sensors, ARRAY_SIZE(sensors));

static struct bt_mesh_elem elements[] = {
    BT_MESH_ELEM(1,
                 BT_MESH_MODEL_LIST(BT_MESH_MODEL_CFG_SRV,
                                    BT_MESH_MODEL_HEALTH_SRV(&health_srv, &health_pub),
                                    BT_MESH_MODEL_LIGHTNESS_SRV(&my_ctx.lightness_srv),
                                    BT_MESH_MODEL_SCENE_SRV(&scene_srv),
                                    BT_MESH_MODEL_SENSOR_SRV(&sensor_srv)), // <-- Added model
                 BT_MESH_MODEL_NONE),
    /* ... other elements ... */
};
  • Don't suppose anybody's had a chance to look at this yet? I've been notified that a few different engineers have been assigned, but there have been no replies so far. Could really do with some help on this.

  • Hi,

    Steve Butler said:
    Could really do with some help on this.

    I am sorry for the delays. Have you gotten any further on your own?

    I may have just misunderstood the capabilities of the sensor server with _ns builds so please let me know if it's not possible to have a sensor server on a _ns target, but the documentation for the mesh light ctrl sample leads me to believe you can.

    The documentation does indicate that ns build targets should work. For later SDK releases it even recommends using the ns variant of the board target, for better security. However, the nRF5340 is not listed as a supported board, and (at least in later SDK releases) the unmodified sample won't build for the DK.

    Using addr2line, I've traced the fault to the following line inside the SDK file nrf/subsys/bluetooth/mesh/sensor_types.c:

    const struct scalar_repr *repr = format->user_data;

    I believe this suggests my non-secure application is trying to access a pointer (format->user_data) that points to a secure memory region, causing a secure fault.

    I agree, your explanation sounds plausible. If so, I would expect the offending variable format to come from the following call:

    int rc = bt_mesh_sensor_value_from_micro(sensor->type->channels[0].format, micro, rsp);

    Are you able to trigger the error there already, by trying to access sensor->type->channels[0].format directly, in that context? I.e. where you now call bt_mesh_sensor_value_from_micro in your code?

    Regards,
    Terje

  • Thanks for the response Terje,

    I've updated the energy_use_get function to the following:

    static int energy_use_get(struct bt_mesh_sensor_srv   *srv,
                              struct bt_mesh_sensor       *sensor,
                              struct bt_mesh_msg_ctx      *ctx,
                              struct bt_mesh_sensor_value *rsp)
    {
        /* Report energy usage as dummy value, and increase it by one every time
         * a get callback is triggered. The logic and hardware for measuring
         * the actual energy usage of the device should be implemented here.
         */
        // int64_t micro = (int64_t)dummy_energy_use * 1000000LL;
        const struct bt_mesh_sensor_format* fmt = sensor->type->channels[0].format;
        if (!fmt) {
            printf("bt_mesh_sensor_format: (null)\n");
            return -EINVAL; /* function returns int, so report an error code */
        }
        printf("bt_mesh_sensor_format @%p {\n", (void *)fmt);
        printf("  cb        : %p\n", (void *)fmt->cb);
        printf("  user_data : %p\n", fmt->user_data);
        printf("  size      : %zu\n", fmt->size);
    #ifdef CONFIG_BT_MESH_SENSOR_LABELS
        printf("  unit      : %p\n", (void *)fmt->unit);
    #endif
        printf("}\n");
        // int rc = bt_mesh_sensor_value_from_micro(sensor->type->channels[0].format, micro, rsp);
        // if (rc)
        // {
        //     LOG_ERR("energy_use_get: bt_mesh_sensor_value_from_micro failed: %d", rc);
        //     return rc;
        // }
        dummy_energy_use++;
        LOG_INF("energy_use_get: success dummy=%d", dummy_energy_use);
        return 0;
    }

    And when I try to get sensor information, the console reads:


    Running addr2line again on 0x00074e8a returns:
    <project-root>/external/nrf/subsys/bluetooth/mesh/sensor.c:55

    Line 55 is:

    if (fmt->cb->compare(&threshold->range.high,
  • Is there any chance of a response on this? Again, I've received a few notifications that engineers have been assigned, but no replies as of yet.

  • Calling the following function:
    void print_sensor_format(const struct bt_mesh_sensor_format *fmt)
    {
        if (!fmt)
        {
            printf("bt_mesh_sensor_format: (null)\n");
            return;
        }
        printf("bt_mesh_sensor_format @%p {\n", (void *)fmt);
    
        printf("\tcb: %p\n", (void *)fmt->cb);
        printf("\tcb->to_float: %p\n", (void *)fmt->cb->to_float);
        printf("\tcb->to_micro: %p\n", (void *)fmt->cb->to_micro);
        printf("\tcb->to_string: %p\n", (void *)fmt->cb->to_string);
        printf("\tcb->compare: %p\n", (void *)fmt->cb->compare);
    
        printf("\tuser_data: %p\n", fmt->user_data);
        printf("\tsize: %zu\n", fmt->size);
        printf("}\n");
    }
    within the energy_use_get function returns the following:
    bt_mesh_sensor_format @0x7959c {
            cb: 0x20031b00
            cb->to_float: 0x749f3
            cb->to_micro: 0x63b31
            cb->to_string: 0
            cb->compare: 0x749a7
            user_data: 0x78068
            size: 4
    }
    Now, my partitions.yml contains the following:
    app:
      address: 0x40000
      end_address: 0xf0000
      region: flash_primary
      size: 0xb0000
    sram_nonsecure:
      address: 0x20030000
      end_address: 0x20080000
      orig_span: &id002
      - sram_primary
      - rpmsg_nrf53_sram
      region: sram_primary
      size: 0x50000
      span: *id002
    This suggests to me that fmt, cb and compare are all in non-secure memory. Therefore, at the line where the fault occurs:

    if (fmt->cb->compare(&threshold->range.high,

    would it be &threshold->range.high that triggers the secure fault?