Trouble viewing hard faults

I'm trying to view core dumps on Memfault's web app and having trouble doing so. I can't even see hard faults being logged, even though I know they're occurring (I'm forcing a stack overflow and dereferencing a NULL pointer). Is there something I've enabled in my prj.conf that's preventing this? My setup is NCS v1.7.0 on a custom board with an nRF9160. Here's my prj.conf:

##################
# General config #
##################
CONFIG_APP_VERSION="1.0.10test"
CONFIG_REBOOT=y
# Reboot on Error (Change this to 'n' when debugging lockups etc.)
CONFIG_RESET_ON_FATAL_ERROR=y
CONFIG_ASSERT=n
CONFIG_BT_ASSERT=n
# CJSON
CONFIG_CJSON_LIB=y
# Base 64 library (use to compress thingname)
CONFIG_BASE64=y
# NEWLIB C
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y
# Date Time library
CONFIG_DATE_TIME=y
CONFIG_DATE_TIME_NTP=y
CONFIG_DATE_TIME_UPDATE_INTERVAL_SECONDS=0

  • I ended up figuring it out. After some help from Memfault support, I got pointed in the right direction

    "Ah, as far as the weird behavior around coredump capture, you've unfortunately run into a bug that's been since fixed in the Memfault SDK. You'll have to update the Memfault SDK module, here's some instructions:
    https://docs.memfault.com/docs/mcu/nrf-connect-sdk-guide/#updating-the-memfault-sdk

    The fix was made in v0.30.1 of the Memfault SDK:
    https://github.com/memfault/memfault-firmware-sdk/blob/36a634536d2cd3c455545f0b97c926275772153c/CHANGES.md#changes-between-memfault-sdk-0301-and-sdk-0300---april-6-2022"

    So if you're having the same issue, follow the steps there.

    Also, I hit another, much subtler error once I had fixed that. The coredump was generated and saved successfully to the nRF9160's internal flash; however, after reboot, the LwM2M carrier library behaved erratically when connecting to the network (LWM2M_EVENT_BOOTSTRAPPED followed by LWM2M_EVENT_DISCONNECTED, over and over). I remembered a limitation of the library listed under "Requirements and application limitations" in the nRF Connect SDK 1.7.0 documentation (nordicsemi.com). Specifically,

    "The LwM2M carrier library uses the following NVS record key range: 0xCA00 - 0xCAFF. This range must not be used by the application."

    After looking at partitions.yml and lwm2m_os.c, I could see that the partition generated for the coredump overlapped the address range of the storage partition defined in my board file, and that lwm2m_os.c was using that storage partition for its file system. I'm a little confused as to why the LwM2M library uses this old partition from the board file, since I was under the impression that those definitions were deprecated in favor of the Partition Manager. Anyway, I'm sure it's just a remnant of the transition. Hope this helps someone else.
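For anyone who wants to check for this kind of collision without reading partitions.yml by eye, here is a minimal Python sketch of an address-range overlap test. It is only an illustration: the `app` values come from the partition file posted in this thread, the legacy storage range (0xfa000, size 0x3000) is the one quoted in the reply further down, and the `memfault_storage` values are hypothetical placeholders that you would replace with the numbers from your own build.

```python
from itertools import combinations


def overlaps(a, b):
    """Return True if two (start, size) ranges share at least one byte."""
    a_start, a_size = a
    b_start, b_size = b
    return a_start < b_start + b_size and b_start < a_start + a_size


def find_overlaps(partitions):
    """Return the sorted list of partition-name pairs whose ranges collide."""
    return sorted(
        (n1, n2)
        for (n1, r1), (n2, r2) in combinations(sorted(partitions.items()), 2)
        if overlaps(r1, r2)
    )


# name -> (start address, size)
partitions = {
    "app": (0x1C200, 0xD1E00),               # from the posted partition file
    "legacy_storage": (0xFA000, 0x3000),     # legacy NVS range quoted in the thread
    "memfault_storage": (0xF0000, 0x10000),  # HYPOTHETICAL values, check your build
}

for pair in find_overlaps(partitions):
    print("overlap:", pair)
```

With these placeholder numbers the only reported pair is `legacy_storage` and `memfault_storage`, which is the shape of the problem described above.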

    Edit: It would also be nice if you could help me configure this partition file, since it is a challenge. I also discovered that the Settings storage is configured to overlap the LwM2M partition, which could lead to some weird behavior. Here's my dynamic partition file. I already have a static configuration file for my external flash but need some help altering the primary flash partitions.

    alerts:
      address: 0x2fc000
      device: MX25L3
      end_address: 0x31c000
      placement:
        after:
        - biometrics
      region: external_flash
      size: 0x20000
    app:
      address: 0x1c200
      end_address: 0xee000
      region: flash_primary
      size: 0xd1e00
    biometrics:
      address: 0xfc000
      device: MX25L3
      end_address: 0x2fc000
      placement:
        after:
        - indices

  • Hi, and sorry for the late reply.

    First, it is good to hear that you managed to solve your original problem.

    Your new problem is, as you say, caused by the carrier library not using the Partition Manager. This was changed in version 0.30.0 of the carrier library:

    https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/libraries/bin/lwm2m_carrier/CHANGELOG.html#liblwm2m-carrier-0-30-0

    In the changelog, it is also explained how you can make the Partition Manager partition match the device tree partition:

    To use the legacy NVS partition, you can add a pm_static.yml file to your project with the following content:

    lwm2m_carrier:
      address: 0xfa000
      size: 0x3000
    free:
      address: 0xfd000
      size: 0x3000
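As a quick sanity check of the pm_static.yml values above, the following Python sketch works through the address arithmetic. It assumes the nRF9160's 1 MB internal flash and 4 KB flash page size; the helper names `end_address` and `check_layout` are made up for this illustration and are not part of any Nordic tool.

```python
FLASH_SIZE = 0x100000  # assumption: 1 MB internal flash on the nRF9160
PAGE_SIZE = 0x1000     # assumption: 4 KB flash pages

# Values copied from the pm_static.yml snippet in this thread.
pm_static = {
    "lwm2m_carrier": {"address": 0xFA000, "size": 0x3000},
    "free": {"address": 0xFD000, "size": 0x3000},
}


def end_address(part):
    """First address after the partition (exclusive end)."""
    return part["address"] + part["size"]


def check_layout(parts, flash_size=FLASH_SIZE, page_size=PAGE_SIZE):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    ordered = sorted(parts.items(), key=lambda kv: kv[1]["address"])
    for name, part in ordered:
        if part["address"] % page_size or part["size"] % page_size:
            problems.append(f"{name}: not page aligned")
        if end_address(part) > flash_size:
            problems.append(f"{name}: extends past end of flash")
    for (n1, p1), (n2, p2) in zip(ordered, ordered[1:]):
        if end_address(p1) > p2["address"]:
            problems.append(f"{n1} overlaps {n2}")
    return problems


print(hex(end_address(pm_static["lwm2m_carrier"])))  # 0xfd000
print(hex(end_address(pm_static["free"])))           # 0x100000
print(check_layout(pm_static))                       # []
```

The two entries are contiguous: lwm2m_carrier ends at 0xfd000, where free begins, and free ends at 0x100000, which is the top of a 1 MB flash.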

    Best regards,

    Didrik

  • I've been having trouble updating devices via FOTA because of the partitions already in place on those devices. I can probably figure out how to handle this. Is there a way to update the carrier library without updating the version of NCS that I'm using? Also, I know there's a compatibility matrix showing which MFW versions are approved by the various carriers, but is there one showing which versions of NCS work with which versions of the MFW? It'd be great to upgrade from 1.7.0 to 2.x.x. We got our devices certified by Verizon on MFW v1.3.0.

  • These settings should work for the flash_primary region:

    mcuboot:
      address: 0x0
      end_address: 0xc000
      placement:
        before:
        - mcuboot_primary
      region: flash_primary
      size: 0xc000
    mcuboot_primary:
      address: 0xc000
      end_address: 0xee000
      orig_span: &id001
      - spm
      - app
      - mcuboot_pad
      region: flash_primary
      size: 0xe2000
      span: *id001
    mcuboot_pad:
      address: 0xc000
      end_address: 0xc200

    (note that I have not been able to actually test them myself)

    I had to shrink the memfault_storage partition to make room for the lwm2m_carrier partition. To make sure that the Memfault library uses the new size, you should probably also set CONFIG_PM_PARTITION_SIZE_MEMFAULT_STORAGE=0xc000 in your prj.conf. Shrinking the partition should not cause any problems, but you might get less coredump data sent to Memfault. You can avoid that by storing the coredump in noinit RAM instead, similar to how it is done in this commit: https://github.com/simensrostad/fw-nrfconnect-nrf/commit/a366a615b16d93a961b94e9aec3873036d38c61e#

    esisk said:
    is there a compatibility matrix for what version of NCS works with which version of the MFW?

    This matrix shows which NCS version the various modem FW versions have been tested with: https://infocenter.nordicsemi.com/topic/comp_matrix_nrf9160/COMP/nrf9160/nrf9160_modem_fw.html

    However, when dealing with carrier certifications (and especially Verizon), you have to use the combinations listed here: https://infocenter.nordicsemi.com/topic/comp_matrix_nrf9160/COMP/nrf9160/nrf9160_operator_certifications.html
