TF-M crash with certain UART baud rate settings in nRF9160

Dear Nordic Team!

We are struggling with a fatal error occurring during boot phase before entering the application main. When setting the baud rate of UART2 to 38400 or higher, the SPU detects some kind of violation, leading to SPU fault. This also occurs when we copy the device tree in the attachment and some of our prj.conf contents to the lwm2m_client example project. This occurs in both SDK 2.4.2 and 2.5.99-dev1. When running at a baud rate below 38400, the application works as intended and UART communication is ok on UART2.

*** Booting nRF Connect SDK v2.5.99-dev1 ***
I: Starting bootloader
I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Boot source: none
I: Image index: 0, Swap type: none
I: Bootloader chainload address offset: 0x10000
FATAL ERROR: Platform Exceptionlot
Platform Exception: SPU Fault

The prj.conf lines, which we added to cause the error.

# LTE link control
CONFIG_LTE_LINK_CONTROL=y

# Modem info
CONFIG_MODEM_INFO=y

# Enable settings storage
CONFIG_SETTINGS=y
CONFIG_FCB=y
CONFIG_SETTINGS_FCB=y
CONFIG_FLASH_MAP=y
CONFIG_STREAM_FLASH=y

# Allow FOTA downloads using download-client
CONFIG_LWM2M_CLIENT_UTILS=y
CONFIG_LWM2M_CLIENT_UTILS_FIRMWARE_UPDATE_OBJ_SUPPORT=y
CONFIG_DOWNLOAD_CLIENT=y
CONFIG_DOWNLOAD_CLIENT_STACK_SIZE=4096
CONFIG_DOWNLOAD_CLIENT_HTTP_FRAG_SIZE_1024=y
CONFIG_FOTA_DOWNLOAD=y

We cannot find out how the fatal error could be caused by the UART baud rate setting. Does UART in some way interfere with the boot process?

Thank you

pcb_v70_common.dts

Parents
  • Hello, 

    Are you using a custom board? Do you see the same behavior on the nRF9160DK or Thingy:91? What are you using UART2 for? 

    Please provide more details for me to reproduce.

    Kind regards,
    Øyvind

  • Yes we are using a custom board, and it also happens when we flash the binary to the Nordic nRF9160 Devkit. We created a custom Devicetree, which is attached in the initial post. UART2 is intended to be used for inter-processor communication with a Silicon Labs Bluetooth device.

  • Hello, 
    I'm not seeing the SPU fault on my side. I do however see the following errors:

    uart:~$ *** Booting Zephyr OS build v3.3.99-ncs1-1 ***
    
    
    *** Booting Zephyr OS build v3.3.99-ncs1-1 ***
    [00:00:00.255,126] <dbg> net_lwm2m_engine: lwm2m_engine_init: LWM2M engine socket receive thread started
    [00:00:00.255,218] <dbg> net_lwm2m_obj_security: security_create: Create LWM2M security instance: 0
    [00:00:00.255,279] <dbg> net_lwm2m_obj_server: server_create: Create LWM2M server instance: 0
    [00:00:00.255,401] <dbg> net_lwm2m_obj_device: device_create: Create LWM2M device instance: 0
    [00:00:00.255,462] <dbg> net_lwm2m_obj_conn_mon: connmon_create: Create LWM2M connectivity monitoring instance: 0
    [00:00:00.255,523] <dbg> net_lwm2m_obj_firmware: firmware_create: Create LWM2M firmware instance: 0
    [00:00:00.255,554] <dbg> net_lwm2m_obj_location: location_create: Create Location instance: 0
    [00:00:00.255,767] <inf> net_config: Initializing network
    [00:00:00.255,767] <inf> net_config: Waiting interface 1 (0x2000d488) to be up...
    [00:00:30.256,744] <err> net_config: Timeout while waiting network interface
    [00:00:30.256,774] <err> net_config: Network initialization failed (-115)
    [00:00:30.256,805] <inf> net_lwm2m_client_app: Run LWM2M client
    [00:00:30.256,805] <dbg> net_lwm2m_registry: lwm2m_engine_set: path:0/0/0, buf:0x49473, len:21
    [00:00:30.256,866] <dbg> net_lwm2m_registry: lwm2m_engine_set: path:0/0/2, buf:0x2001960f, len:1
    [00:00:30.256,896] <dbg> net_lwm2m_registry: lwm2m_engine_set: path:0/0/10, buf:0x2001960e, len:2
    [00:00:30.256,927] <dbg> net_lwm2m_registry: lwm2m_engine_set: path:1/0/0, buf:0x2001960e, len:2
    [00:00:30.257,263] <dbg> net_ipso_temp_sensor: temp_sensor_create: Create IPSO Temperature Sensor instance: 0
    [00:00:30.257,324] <dbg> net_ipso_light_control: light_control_create: Create IPSO Light Control instance: 0
    [00:00:30.257,446] <dbg> net_ipso_timer: timer_inst_create: Create IPSO Timer instance: 0
    [00:00:30.257,598] <inf> net_lwm2m_rd_client: Start LWM2M Client: nrf9160dk_nrf9160
    [00:00:30.258,148] <dbg> net_lwm2m_registry: lwm2m_engine_set: path:3303/0/5700, buf:0x20019de8, len:8
    [00:00:30.758,422] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:30.758,483] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:30.758,514] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:30.758,575] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:30.758,636] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:30.758,728] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:30.758,758] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:31.758,972] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:31.759,002] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:31.759,033] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:31.759,094] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:31.759,155] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:31.759,277] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:31.759,277] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:33.259,582] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:33.259,643] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:33.259,674] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:33.259,735] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:33.259,765] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:33.259,887] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:33.259,887] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:35.760,375] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:35.760,437] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:35.760,467] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:35.760,528] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:35.760,559] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:35.760,681] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:35.760,681] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:40.261,657] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:40.261,718] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:40.261,749] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:40.261,779] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:40.261,840] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:40.261,962] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:40.261,962] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:48.763,580] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/1/0, level 3, buf:0x200188bf, buflen:1
    [00:00:48.763,641] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/10/0, level 3, buf:0x200188ee, buflen:2
    [00:00:48.763,671] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0x200188a8, buflen:4
    [00:00:48.763,732] <inf> net_lwm2m_rd_client: RD Client started with endpoint 'nrf9160dk_nrf9160' with client lifetime 30
    [00:00:48.763,763] <dbg> net_lwm2m_message_handling: lwm2m_parse_peerinfo: Parse url: coap://192.0.2.2:5683
    [00:00:48.763,885] <err> net_lwm2m_engine: Failed to create socket: 23
    [00:00:48.763,885] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-23)
    [00:00:48.763,916] <err> net_lwm2m_client_app: LwM2M engine reported a network error.

    Is your project based on our Lwm2m_client? Do you build with any of the included overlays? 

  • Hello,

    we have found out that the error occurs on the provided sample always if the uart2 RX pin 23 is tied 'high' (IO supply voltage). It does not occur when setting to 'low' (GND). The same behavior can be observed when swapping the pin control assignments between uart1 and uart2 as follows. In this case, the error is triggered by setting pin 0 to 'high'.

    
    arduino_serial: &uart1 {
    	status = "okay";
    	current-speed = <115200>;
    	pinctrl-0 = <&uart2_default>;
    	pinctrl-1 = <&uart2_sleep>;
    	pinctrl-names = "default", "sleep";
    };
    
    &uart2 {
    	status = "okay";
    	current-speed = <115200>; /* TODO: change baudrate here 28800 (working) <=> 38400 (not working) */
    	pinctrl-0 = <&uart1_default>;
    	pinctrl-1 = <&uart1_sleep>;
    	pinctrl-names = "default", "sleep";
    };

    This strange behavior was observed on two devkits with the B1 revision and also on one devkit with the B0 revision nRF9160. The modem firmware was 1.3.5 in all cases. We could not try it on the nRF9161 as we were not able to obtain one yet. A binary is attached at the end.

    Kind regards

    8228.merged.hex

  • Hi e-va,

    Could you please try to enable TF-M logging to see if it can give us any clues?

    Best regards,

    Hieu

  • Hi,

    this is the entire uart1 TFM output.

    FATAL ERROR: Platform Exception
    Platform Exception: SPU Fault
    

    Best regards

  • Hi e-va,

    Ok... it is very unfortunate that there is no clear clue there.

    We will continue to investigate this and follow-up with you.

    Best regards,

    Hieu


    P.s: I actually also think about increasing the log level, but as the default level is already INFO, I don't have high hope for it. If it's not too much, do you mind retrying with increased log level as well? You can use these configs:

    CONFIG_TFM_PARTITION_LOG_LEVEL_DEBUG
    CONFIG_TFM_SPM_LOG_LEVEL_DEBUG

Reply Children
  • Hi e-va,

    Could you please explain further how you tie the RX pin to 'high' to produce the issue? I tried your sample code, but when I directly wire the RX pin to VDD, the application couldn't run at all. I am not even sure it left the bootloader. All I got on the log is the boot banner of the bootloader.

    Also, does your description mean that the issue only happens when using pin 23 specifically?
    Or does it only mean the issue happen when UART RX is tied to logic high?
    i.e., if you use a different pin for UART RX, can you reproduce the issue?

    Why do you need the UART RX pin tied to logic high? Is this some startup condition of your system?

    Best regards,

    Hieu

  • Hello Hieu,

    we tie the pins to 'high' / 'low' by connecting the RX pin to either VDD or GND using a jumper cable on the devkit. We can also obverve the behavior you have described, where the application crashes without any further information output on VCOM0 (app UART) and VCOM1 (TF-M UART), when using toolchain 2.4.2 and SDK 2.4.2. When changing to toolchain 2.5.2 and SDK 2.5.99-dev1, a boot loop occurs, and there is the fatal error output on VCOM1. Boot is always successful when changing the UART baud rate to 28800, and always fails when using baud rate 38400 and setting the uart2 RX pin 'high'.
    As described in a previous post, the crash is also observed when swapping the pinctrl assignments between uart1 and uart2. In this case, the crash happens when setting RX pin 0 to 'high', which originally is the uart1 RX pin, and becomes the uart2 RX pin after the swap. It seems that the issue is uart2 specific, and not related to the pins.
    The RX pin is initially 'high' due to the idle state of UART being 'high'. We could probably implement some firmware adaptions on the transmitter side to allow boot by setting TX low or agreeing on a reduced baud rate. Finding the root cause is however preferred to avoid future issues.

    Appending the flags CONFIG_TFM_PARTITION_LOG_LEVEL_DEBUG=y and CONFIG_TFM_SPM_LOG_LEVEL_DEBUG=y in prf.conf yields in the following output from TF-M on uart2 when using toolchain 2.5.2 and SDK 2.5.99-dev1:

    FATAL ERROR: Platform Exception
    Here is some context for the exception:
        EXC_RETURN (LR): 0xFFFFFFF9
        Exception came from secure FW in thread mode.
        xPSR:    0x28000013
        MSP:     0x20000BB8
        PSP:     0x20000BF8
        MSP_NS:  0x00000000
        PSP_NS:  0xFFFFFFFC
        Exception frame at: 0x20000BB8
           (Note that the exception frame may be corrupted for this type of error.)
            R0:   0x00000000
            R1:   0x000000AF
            R2:   0x00000008
            R3:   0xF933ED20
            R12:  0x00000000
            LR:   0x00012B7F
            PC:   0x00012B8A
            xPSR: 0x6900F000
        CFSR:  0x00000000
        BFSR:  0x00000000
        BFAR:  Not Valid
        MMFSR: 0x00000000
        MMFAR: Not Valid
        UFSR:  0x00000000
        HFSR:  0x00000000
        SFSR:  0x00000000
        SFAR: Not Valid
    Platform Exception: SPU Fault
      RAMACCERR

    In case of a successful boot (uart2 RX 'low' or baud rate 28800), the output is:

    [Sec Thread] Secure image initializing!
    TF-M isolation level is: 0x00000001
    Booting TF-M v1.8.0

    Best regards

  • Hello e-va,

    Sorry, the question about why tying UART RX to high was very silly of me. Thank you for answering regardless.

    I run a number more tests today and cannot reproduce your observation.

    Here is a brief of my test results. As I don't have 2.5.99-dev1 installed, I opted for 2.5.2 instead.
    All of them are built with your custom board though.

                                    | Default UART2 | P0.25 as | Uses UART1 
    						        | pins          | UART2 RX | pin control
    ------------------------------------------------------------------------
    Your project, NCS v2.4.2        | Fails         | Works    | Works
    ------------------------------------------------------------------------
    Your project, NCS v2.5.2        | Fails         | Works    | Works
    ------------------------------------------------------------------------
    LwM2M Client sample, NCS v2.5.2 | Works         | Works    | Works

    As you can see, our difference in observation is quite strange, on top of how strange the issue is in the first place.

    I am not quite clear what would be a good next step and will also be off duty for the next two weeks. I will raise this issue internally to see if anyone has an idea for what to do next, and also arrange someone to take over.

    While waiting for further support, could you please let us know if using UART1 instead of UART2 is an acceptable workaround for you now?

    Not sure how much this help, but I will also attach my compiled hex files for references.

    c318755_230301_hex_files.zip

    Best regards,

    Hieu

  • Hello,

    I took over the case temporary from my colleague, Hieu.

    I got a recomendation to check which code is being run when RAMACCERR gets triggered.

    Could you check in your .map file the following address?

            PC:   0x00012B8A
    Best regards,

    Michal

  • Hi all,

    Sorry to awake an old thread, but... I'm running into the same problem here, also on a custom board. Didn't check the pin high/low trick yet, but otherwise all is pretty much identical. Did this ever get resolved?

    I'm using ncs 2.7, my exception report has PC:   0x00013000. If i read the .map file correct (build\zephyr\zephyr_final.map), code starts at 0x18000, so not sure what's happening there..

Related