Zephyr/NCS crashes with SPIM and BLE active

I'm attempting to troubleshoot a system crash that occurs when polling a sensor at higher ODRs while BLE is enabled. I'm using NCS v2.9.0 with the nRF54L15. I checked the thread stack allocation with thread analyzer, and every thread looks well under its limit. Logic analyzer output also shows ample time between reads (see below).

Midway through polling data, the SPI bus goes down along with any active BLE connection or advertising, and the chip stops outputting logs. This is difficult to analyze directly because debug configs like CONFIG_DEBUG and CONFIG_STACK_SENTINEL make the issue disappear and allow the firmware to operate properly, despite the higher overhead. My code is minimal, mostly just a button work handler in main and a sensor library that fetches on an interrupt trigger:

// sensor_manager.c

#define SENSOR_ODR 1920

static void lsm6dsv32x_trig_handler(const struct device *dev,
                                       const struct sensor_trigger *trig) {
    sensor_sample_fetch_chan(dev, SENSOR_CHAN_LSM6DSV32X_FIFO);
}

int sensor_manager_init(void) {
    const struct device *const lsm6dsv32x = DEVICE_DT_GET_ONE(st_lsm6dsv32x);
    struct sensor_trigger trig;

    trig.type = SENSOR_TRIG_DATA_READY;
    trig.chan = SENSOR_CHAN_LSM6DSV32X_FIFO;
    sensor_trigger_set(lsm6dsv32x, &trig, lsm6dsv32x_trig_handler);

    sensor_manager_toggle(false);

    return 0;
}

int sensor_manager_toggle(bool enable) {
    const struct device *const lsm6dsv32x = DEVICE_DT_GET_ONE(st_lsm6dsv32x);
    struct sensor_value odr_attr;
    odr_attr.val1 = (enable) ? SENSOR_ODR : 0;
    odr_attr.val2 = 0;

    sensor_attr_set(lsm6dsv32x, SENSOR_CHAN_ACCEL_XYZ, SENSOR_ATTR_SAMPLING_FREQUENCY, &odr_attr);
    sensor_attr_set(lsm6dsv32x, SENSOR_CHAN_GYRO_XYZ, SENSOR_ATTR_SAMPLING_FREQUENCY, &odr_attr);

    return 0;
}

// main.c

void button_long_press_cb(void) {
    LOG_INF("Long press detected");
    static bool is_live = false;
    is_live = !is_live;
    sensor_manager_toggle(is_live);
}

int main(void) {
    button_init();
    ble_manager_init();
    sensor_manager_init();
    return 0;
}

// FIFO fetch function

int lsm6dsv32x_fifo_fetch_output(const struct device *dev)
{
    // checking FIFO status bits, init variables, etc
    uint16_t num = fifo_status.fifo_level;

    while (num--) {
        lsm6dsv32x_fifo_out_raw_t f_data;

        /* One bus transaction per FIFO word */
        if (lsm6dsv32x_fifo_out_raw_get(ctx, &f_data) != 0) {
            return -1;
        }
    }

    return 0;
}

What could be going wrong with my setup? I was able to poll at a high rate (2 kHz) previously when I had FIFO batching disabled, so on the surface this doesn't seem to be related to resource constraints.

  • Hello,

    Have you tried to debug the application to see where it crashes?

    Thanks,

    BR
    Kazi

  • Hi,

    Enabling debug configs resolves the problem, which unfortunately makes it difficult to troubleshoot. SystemView would also overflow any time I attempted to run the program with the sensor active. Users on the Zephyr Discord server pointed out that this may be caused by deadlock, and that does seem to have been the root cause. At a high ODR (or with a large data transfer), the system would be blocked by continuous interrupts, preventing it from servicing any other threads. Since my sensor driver was already out-of-tree, I updated it to receive bulk data from the sensor in one transfer rather than initiating an SPI transaction for each sample in the sensor's FIFO buffer, and it works great now. In hindsight, a fetch call lasting 3-4 ms probably wasn't too healthy for a system with the BLE stack and logging running on the same core.
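
For anyone landing here with the same symptom, the bulk-read change described in the reply above can be sketched roughly as follows. This is not the poster's driver code: the names fifo_fetch_bulk, fifo_burst_read_fn, struct fifo_sample, FIFO_MAX_WORDS and fake_burst_read are hypothetical, and the sketch assumes each FIFO word is 7 bytes (one tag byte with the tag in bits 7:3, then X, Y, Z as little-endian 16-bit values) and that the sensor lets a single burst read return several consecutive words. Check both assumptions against the datasheet before relying on them. The bus access is hidden behind a function pointer so the parsing can be exercised on a host without hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_WORD_SIZE 7U   /* assumed: 1 tag byte + 6 data bytes */
#define FIFO_MAX_WORDS 32U  /* hypothetical cap on words read per fetch */

struct fifo_sample {
    uint8_t tag;            /* sensor tag taken from bits 7:3 of the tag byte */
    int16_t x, y, z;
};

/* Hypothetical bus hook: ONE burst read of len bytes from the FIFO output. */
typedef int (*fifo_burst_read_fn)(void *bus, uint8_t *buf, size_t len);

/*
 * Read up to 'level' FIFO words with a single bus transaction, then decode
 * them from RAM. Returns the number of decoded samples, or -1 on bus error.
 * The static buffer makes this non-reentrant, which is fine for one sensor.
 */
int fifo_fetch_bulk(fifo_burst_read_fn read, void *bus, uint16_t level,
                    struct fifo_sample *out, size_t out_cap)
{
    static uint8_t raw[FIFO_MAX_WORDS * FIFO_WORD_SIZE];
    size_t n = level;

    if (n > FIFO_MAX_WORDS) {
        n = FIFO_MAX_WORDS;   /* leftover words are picked up by the next fetch */
    }
    if (n > out_cap) {
        n = out_cap;
    }
    if (n == 0U) {
        return 0;             /* nothing to read, so do not touch the bus */
    }

    if (read(bus, raw, n * FIFO_WORD_SIZE) != 0) {
        return -1;
    }

    for (size_t i = 0; i < n; i++) {
        const uint8_t *w = &raw[i * FIFO_WORD_SIZE];

        out[i].tag = (uint8_t)(w[0] >> 3);
        out[i].x = (int16_t)((uint16_t)w[1] | ((uint16_t)w[2] << 8));
        out[i].y = (int16_t)((uint16_t)w[3] | ((uint16_t)w[4] << 8));
        out[i].z = (int16_t)((uint16_t)w[5] | ((uint16_t)w[6] << 8));
    }

    return (int)n;
}

/* Host-side stand-in for the SPI bus: counts calls and fills a test pattern. */
int fake_burst_read(void *bus, uint8_t *buf, size_t len)
{
    unsigned int *calls = bus;

    (*calls)++;
    for (size_t i = 0; i + FIFO_WORD_SIZE <= len; i += FIFO_WORD_SIZE) {
        uint16_t k = (uint16_t)(i / FIFO_WORD_SIZE);
        uint16_t neg = (uint16_t)(0U - k);

        buf[i] = (uint8_t)(0x02U << 3);          /* arbitrary tag value 2 */
        buf[i + 1] = (uint8_t)(k & 0xFFU);       /* x = k */
        buf[i + 2] = (uint8_t)(k >> 8);
        buf[i + 3] = (uint8_t)(neg & 0xFFU);     /* y = -k */
        buf[i + 4] = (uint8_t)(neg >> 8);
        buf[i + 5] = 0x34U;                      /* z = 0x1234 */
        buf[i + 6] = 0x12U;
    }
    return 0;
}
```

In a real driver the hook would wrap whatever burst-read call the bus layer already provides. The point of the design is that the time spent inside the fetch is dominated by one transfer instead of per-word transaction overhead, which leaves more room for the BLE stack and logging between data-ready interrupts.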
