TF-M crash with certain UART baud rate settings in nRF9160

Dear Nordic Team!

We are struggling with a fatal error that occurs during boot, before the application's main is entered. When the baud rate of UART2 is set to 38400 or higher, the SPU detects a violation, which leads to an SPU fault. The error also occurs when we copy the attached device tree and some of our prj.conf contents into the lwm2m_client sample project, and it occurs with both SDK 2.4.2 and 2.5.99-dev1. At baud rates below 38400, the application works as intended and communication on UART2 is fine.

*** Booting nRF Connect SDK v2.5.99-dev1 ***
I: Starting bootloader
I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Boot source: none
I: Image index: 0, Swap type: none
I: Bootloader chainload address offset: 0x10000
FATAL ERROR: Platform Exceptionlot
Platform Exception: SPU Fault

These are the prj.conf lines we added that trigger the error:

# LTE link control
CONFIG_LTE_LINK_CONTROL=y

# Modem info
CONFIG_MODEM_INFO=y

# Enable settings storage
CONFIG_SETTINGS=y
CONFIG_FCB=y
CONFIG_SETTINGS_FCB=y
CONFIG_FLASH_MAP=y
CONFIG_STREAM_FLASH=y

# Allow FOTA downloads using download-client
CONFIG_LWM2M_CLIENT_UTILS=y
CONFIG_LWM2M_CLIENT_UTILS_FIRMWARE_UPDATE_OBJ_SUPPORT=y
CONFIG_DOWNLOAD_CLIENT=y
CONFIG_DOWNLOAD_CLIENT_STACK_SIZE=4096
CONFIG_DOWNLOAD_CLIENT_HTTP_FRAG_SIZE_1024=y
CONFIG_FOTA_DOWNLOAD=y

We cannot work out how the UART baud rate setting could cause the fatal error. Does the UART interfere with the boot process in some way?

Thank you

pcb_v70_common.dts

  • Hello, 

    Are you using a custom board? Do you see the same behavior on the nRF9160DK or Thingy:91? What are you using UART2 for? 

    Please provide more details for me to reproduce.

    Kind regards,
    Øyvind

  • Yes, we are using a custom board, and the error also occurs when we flash the binary to the Nordic nRF9160 DK. We created a custom devicetree, which is attached to the initial post. UART2 is intended for inter-processor communication with a Silicon Labs Bluetooth device.

  • Hello Hieu,

    We tie the pins 'high' / 'low' by connecting the RX pin to either VDD or GND with a jumper cable on the devkit. We can also observe the behavior you described, where the application crashes without any further output on VCOM0 (app UART) or VCOM1 (TF-M UART), when using toolchain 2.4.2 and SDK 2.4.2. When we change to toolchain 2.5.2 and SDK 2.5.99-dev1, a boot loop occurs and the fatal error is printed on VCOM1. Boot always succeeds when the UART baud rate is changed to 28800, and always fails at baud rate 38400 with the uart2 RX pin set 'high'.
    As described in a previous post, the crash is also observed when swapping the pinctrl assignments between uart1 and uart2. In this case, the crash happens when setting RX pin 0 to 'high', which originally is the uart1 RX pin, and becomes the uart2 RX pin after the swap. It seems that the issue is uart2 specific, and not related to the pins.
    The RX pin is initially 'high' because the idle state of a UART line is 'high'. We could probably adapt the firmware on the transmitter side to allow boot, either by driving TX low or by agreeing on a reduced baud rate. However, we would prefer to find the root cause to avoid future issues.

    Appending the flags CONFIG_TFM_PARTITION_LOG_LEVEL_DEBUG=y and CONFIG_TFM_SPM_LOG_LEVEL_DEBUG=y to prj.conf yields the following output from TF-M on uart2 when using toolchain 2.5.2 and SDK 2.5.99-dev1:

    FATAL ERROR: Platform Exception
    Here is some context for the exception:
        EXC_RETURN (LR): 0xFFFFFFF9
        Exception came from secure FW in thread mode.
        xPSR:    0x28000013
        MSP:     0x20000BB8
        PSP:     0x20000BF8
        MSP_NS:  0x00000000
        PSP_NS:  0xFFFFFFFC
        Exception frame at: 0x20000BB8
           (Note that the exception frame may be corrupted for this type of error.)
            R0:   0x00000000
            R1:   0x000000AF
            R2:   0x00000008
            R3:   0xF933ED20
            R12:  0x00000000
            LR:   0x00012B7F
            PC:   0x00012B8A
            xPSR: 0x6900F000
        CFSR:  0x00000000
        BFSR:  0x00000000
        BFAR:  Not Valid
        MMFSR: 0x00000000
        MMFAR: Not Valid
        UFSR:  0x00000000
        HFSR:  0x00000000
        SFSR:  0x00000000
        SFAR: Not Valid
    Platform Exception: SPU Fault
      RAMACCERR

    In case of a successful boot (uart2 RX 'low' or baud rate 28800), the output is:

    [Sec Thread] Secure image initializing!
    TF-M isolation level is: 0x00000001
    Booting TF-M v1.8.0

    Best regards
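
As a reading aid for the dump above, here is a minimal Python sketch that decodes the EXC_RETURN (LR) value. It assumes the Armv8-M bit layout of EXC_RETURN; the function name and dictionary keys are made up for illustration and are not part of TF-M. For 0xFFFFFFF9 it reports thread mode with the frame on the main stack, which agrees with "Exception came from secure FW in thread mode" and with the exception frame address being equal to MSP.

```python
# Sketch only: decode an Armv8-M EXC_RETURN value such as the LR in the dump.
# decode_exc_return and its key names are hypothetical helpers, not TF-M code.

def decode_exc_return(value: int) -> dict:
    """Split an EXC_RETURN value into its documented flag bits."""
    if (value >> 24) != 0xFF:
        raise ValueError("not an EXC_RETURN value (top byte must be 0xFF)")
    return {
        "exception_taken_to_secure": bool(value & (1 << 0)),  # ES
        "frame_on_psp": bool(value & (1 << 2)),               # SPSEL (False = MSP)
        "from_thread_mode": bool(value & (1 << 3)),           # Mode
        "standard_frame_no_fp": bool(value & (1 << 4)),       # FType
        "default_callee_stacking": bool(value & (1 << 5)),    # DCRS
        "secure_stack_used": bool(value & (1 << 6)),          # S
    }


if __name__ == "__main__":
    # Value taken from the exception dump in this post.
    for name, flag in decode_exc_return(0xFFFFFFF9).items():
        print(f"{name}: {flag}")
```

This only interprets the dump; it does not explain why the SPU raised RAMACCERR.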

  • Hello e-va,

    Sorry, the question about why tying UART RX to high was very silly of me. Thank you for answering regardless.

    I ran a number of further tests today and could not reproduce your observation.

    Here is a brief summary of my test results. As I don't have 2.5.99-dev1 installed, I opted for 2.5.2 instead.
    All of them are built with your custom board though.

                                    | Default UART2 | P0.25 as | Uses UART1
                                    | pins          | UART2 RX | pin control
    ------------------------------------------------------------------------
    Your project, NCS v2.4.2        | Fails         | Works    | Works
    ------------------------------------------------------------------------
    Your project, NCS v2.5.2        | Fails         | Works    | Works
    ------------------------------------------------------------------------
    LwM2M Client sample, NCS v2.5.2 | Works         | Works    | Works

    As you can see, our difference in observation is quite strange, on top of how strange the issue is in the first place.

    I am not sure what a good next step would be, and I will also be off duty for the next two weeks. I will raise this issue internally to see if anyone has an idea for what to do next, and also arrange for someone to take over.

    While waiting for further support, could you please let us know if using UART1 instead of UART2 is an acceptable workaround for you now?

    I am not sure how much this helps, but I will also attach my compiled hex files for reference.

    c318755_230301_hex_files.zip

    Best regards,

    Hieu

  • Hello,

    I have temporarily taken over the case from my colleague, Hieu.

    I got a recommendation to check which code is running when RAMACCERR is triggered.

    Could you check in your .map file the following address?

            PC:   0x00012B8A
    Best regards,

    Michal
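
One way to do this lookup, sketched in Python under a few assumptions: the symbol list comes from something like `arm-none-eabi-nm -n <elf>`, and the ELF is the one that actually contains the address. Since the boot log shows a chainload offset of 0x10000 and the dump says the exception came from secure firmware, the PC probably belongs to the TF-M secure image rather than the application, but that is an inference. The symbol names and addresses in SAMPLE_NM are invented placeholders, not taken from any real build.

```python
# Sketch only: map a faulting PC to the nearest preceding function symbol.
# SAMPLE_NM is made-up data standing in for real `nm -n` output.
import bisect

SAMPLE_NM = """\
00012b00 T hypothetical_func_a
00012b60 T hypothetical_func_b
00012c00 T hypothetical_func_c
"""


def parse_nm(nm_text):
    """Parse '<hex addr> <type> <name>' lines into a sorted (addr, name) list."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols
        addr, sym_type, name = parts
        if sym_type.lower() in ("t", "w"):  # code and weak symbols only
            symbols.append((int(addr, 16) & ~1, name))  # clear a possible Thumb bit
    symbols.sort()
    return symbols


def find_symbol(symbols, pc):
    """Return (name, offset) of the last symbol at or below pc, or None."""
    addrs = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addrs, pc) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return name, pc - start


if __name__ == "__main__":
    print(find_symbol(parse_nm(SAMPLE_NM), 0x00012B8A))
```

The result is only as good as the symbol table: without symbol sizes, an address in padding or data after the last function is still attributed to that function, so a large offset should be treated with suspicion.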

  • Hi all,

    Sorry to revive an old thread, but I'm running into the same problem here, also on a custom board. I haven't checked the pin high/low trick yet, but otherwise everything is pretty much identical. Did this ever get resolved?

    I'm using NCS 2.7, and my exception report has PC:   0x00013000. If I read the .map file correctly (build\zephyr\zephyr_final.map), code starts at 0x18000, so I'm not sure what's happening there.

  • Hello basvkesteren,

    We did not find time to investigate the problem further after checking the map file, and we were also unsure how to interpret it. For now, we have settled on keeping the baud rate below the threshold that triggers the problem. We have not seen the problem again at this baud rate.

    BR
