TF-M crash with certain UART baud rate settings in nRF9160

Dear Nordic Team!

We are struggling with a fatal error that occurs during the boot phase, before the application's main is entered. When the baud rate of UART2 is set to 38400 or higher, the SPU detects some kind of violation, leading to an SPU fault. The same happens when we copy the attached devicetree and some of our prj.conf contents into the lwm2m_client example project. It occurs in both SDK 2.4.2 and 2.5.99-dev1. At baud rates below 38400, the application works as intended and communication on UART2 is fine.

*** Booting nRF Connect SDK v2.5.99-dev1 ***
I: Starting bootloader
I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Boot source: none
I: Image index: 0, Swap type: none
I: Bootloader chainload address offset: 0x10000
FATAL ERROR: Platform Exceptionlot
Platform Exception: SPU Fault

These are the prj.conf lines we added that trigger the error:

# LTE link control
CONFIG_LTE_LINK_CONTROL=y

# Modem info
CONFIG_MODEM_INFO=y

# Enable settings storage
CONFIG_SETTINGS=y
CONFIG_FCB=y
CONFIG_SETTINGS_FCB=y
CONFIG_FLASH_MAP=y
CONFIG_STREAM_FLASH=y

# Allow FOTA downloads using download-client
CONFIG_LWM2M_CLIENT_UTILS=y
CONFIG_LWM2M_CLIENT_UTILS_FIRMWARE_UPDATE_OBJ_SUPPORT=y
CONFIG_DOWNLOAD_CLIENT=y
CONFIG_DOWNLOAD_CLIENT_STACK_SIZE=4096
CONFIG_DOWNLOAD_CLIENT_HTTP_FRAG_SIZE_1024=y
CONFIG_FOTA_DOWNLOAD=y

We cannot find out how the fatal error could be caused by the UART baud rate setting. Does UART in some way interfere with the boot process?

Thank you

pcb_v70_common.dts

0 Øyvind over 1 year ago

Hello,

Are you using a custom board? Do you see the same behavior on the nRF9160DK or Thingy:91? What are you using UART2 for?

Please provide more details for me to reproduce.

Kind regards,
Øyvind

0 e-va over 1 year ago in reply to Øyvind

Yes, we are using a custom board, and it also happens when we flash the binary to the Nordic nRF9160 devkit. We created a custom devicetree, which is attached to the initial post. UART2 is intended for inter-processor communication with a Silicon Labs Bluetooth device.

0 e-va over 1 year ago in reply to Hieu
Hello Hieu,

we tie the pins to 'high' / 'low' by connecting the RX pin to either VDD or GND using a jumper cable on the devkit. We can also observe the behavior you described, where the application crashes without any further output on VCOM0 (app UART) or VCOM1 (TF-M UART), when using toolchain 2.4.2 and SDK 2.4.2. After changing to toolchain 2.5.2 and SDK 2.5.99-dev1, a boot loop occurs and the fatal error is printed on VCOM1. Boot always succeeds when the UART baud rate is changed to 28800, and always fails at baud rate 38400 with the uart2 RX pin set 'high'.
As described in a previous post, the crash is also observed when swapping the pinctrl assignments between uart1 and uart2. In this case, the crash happens when setting RX pin 0 to 'high', which is originally the uart1 RX pin and becomes the uart2 RX pin after the swap. It seems that the issue is specific to uart2 and not related to the pins.
The RX pin is initially 'high' because the idle state of UART is 'high'. We could probably implement some firmware adaptations on the transmitter side to allow boot, either by setting TX low or by agreeing on a reduced baud rate. However, we would prefer to find the root cause to avoid future issues.

Appending the flags CONFIG_TFM_PARTITION_LOG_LEVEL_DEBUG=y and CONFIG_TFM_SPM_LOG_LEVEL_DEBUG=y to prj.conf yields the following output from TF-M on uart2 when using toolchain 2.5.2 and SDK 2.5.99-dev1:

FATAL ERROR: Platform Exception
Here is some context for the exception:
EXC_RETURN (LR): 0xFFFFFFF9
Exception came from secure FW in thread mode.
xPSR: 0x28000013
MSP: 0x20000BB8
PSP: 0x20000BF8
MSP_NS: 0x00000000
PSP_NS: 0xFFFFFFFC
Exception frame at: 0x20000BB8
(Note that the exception frame may be corrupted for this type of error.)
R0: 0x00000000
R1: 0x000000AF
R2: 0x00000008
R3: 0xF933ED20
R12: 0x00000000
LR: 0x00012B7F
PC: 0x00012B8A
xPSR: 0x6900F000
CFSR: 0x00000000
BFSR: 0x00000000
BFAR: Not Valid
MMFSR: 0x00000000
MMFAR: Not Valid
UFSR: 0x00000000
HFSR: 0x00000000
SFSR: 0x00000000
SFAR: Not Valid
Platform Exception: SPU Fault RAMACCERR
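For anyone reading the dump above, the EXC_RETURN value can be decoded by hand. The following is a minimal sketch in Python, assuming the Armv8-M EXC_RETURN bit layout (ES in bit 0, SPSEL in bit 2, Mode in bit 3, FType in bit 4, DCRS in bit 5, S in bit 6). The function name and the dictionary keys are made up for illustration and are not part of TF-M or the SDK.

```python
# Armv8-M EXC_RETURN bit positions (per the Armv8-M architecture manual).
ES, SPSEL, MODE, FTYPE, DCRS, S = 0, 2, 3, 4, 5, 6


def decode_exc_return(value: int) -> dict:
    """Split an EXC_RETURN value into its flag bits (illustrative helper)."""
    if (value >> 24) != 0xFF:
        raise ValueError(f"not an EXC_RETURN value: {value:#010x}")

    def bit(n: int) -> int:
        return (value >> n) & 1

    return {
        "exception_taken_to_secure": bool(bit(ES)),
        "stack_pointer": "PSP" if bit(SPSEL) else "MSP",
        "came_from": "thread" if bit(MODE) else "handler",
        "standard_frame_no_fp": bool(bit(FTYPE)),
        "default_callee_stacking": bool(bit(DCRS)),
        "secure_stack_used": bool(bit(S)),
    }


info = decode_exc_return(0xFFFFFFF9)
for key, val in info.items():
    print(f"{key}: {val}")
```

For 0xFFFFFFF9 this reports thread mode, the secure stack, and MSP as the stack pointer in use, which agrees with the "Exception came from secure FW in thread mode" line and with the exception frame address being equal to MSP.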

In case of a successful boot (uart2 RX 'low' or baud rate 28800), the output is:

[Sec Thread] Secure image initializing!
TF-M isolation level is: 0x00000001
Booting TF-M v1.8.0

Best regards

0 Hieu over 1 year ago in reply to e-va
Hello e-va,

Sorry, the question about why tying UART RX to high was very silly of me. Thank you for answering regardless.

I ran a number of additional tests today and could not reproduce your observation.

Here is a brief summary of my test results. As I don't have 2.5.99-dev1 installed, I opted for 2.5.2 instead.
All of them were built for your custom board, though.

                                | Default UART2 | P0.25 as | Uses UART1
                                | pins          | UART2 RX | pin control
------------------------------------------------------------------------
Your project, NCS v2.4.2        | Fails         | Works    | Works
------------------------------------------------------------------------
Your project, NCS v2.5.2        | Fails         | Works    | Works
------------------------------------------------------------------------
LwM2M Client sample, NCS v2.5.2 | Works         | Works    | Works

As you can see, our difference in observation is quite strange, on top of how strange the issue is in the first place.

I am not sure what a good next step would be, and I will also be off duty for the next two weeks. I will raise this issue internally to see if anyone has an idea for what to do next, and also arrange for someone to take over.

While waiting for further support, could you please let us know if using UART1 instead of UART2 is an acceptable workaround for you now?

I am not sure how much this helps, but I will also attach my compiled hex files for reference.

c318755_230301_hex_files.zip

Best regards,

Hieu

0 Michal over 1 year ago in reply to Hieu

Hello,

I have temporarily taken over the case from my colleague, Hieu.

I got a recommendation to check which code is running when RAMACCERR is triggered.

Could you check in your .map file the following address?

PC: 0x00012B8A
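
As an alternative to reading the .map file by eye, the lookup can be scripted. This is a minimal sketch that assumes you have a symbol listing in the `nm -n` format (`<hex address> <type> <name>`), taken from the ELF of the image that contains the faulting address. The symbol names and addresses in `SAMPLE_NM` are hypothetical placeholders, not taken from any real build.

```python
import bisect

# Hypothetical listing in `nm -n` format; replace with output from your own ELF.
SAMPLE_NM = """\
00012b00 T hypothetical_func_a
00012b60 T hypothetical_func_b
00012bc0 T hypothetical_func_c
         U hypothetical_undefined
"""


def parse_nm(text):
    """Return a sorted list of (address, name) from `nm`-style lines."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skips undefined symbols, which have no address column
        try:
            symbols.append((int(parts[0], 16), parts[2]))
        except ValueError:
            continue
    return sorted(symbols)


def symbol_for(symbols, pc):
    """Return (name, offset) of the nearest symbol at or below pc, else None."""
    addresses = [addr for addr, _ in symbols]
    idx = bisect.bisect_right(addresses, pc) - 1
    if idx < 0:
        return None
    addr, name = symbols[idx]
    return name, pc - addr


match = symbol_for(parse_nm(SAMPLE_NM), 0x00012B8A)
print(match)
```

The result only says which symbol starts closest below the address. Without symbol sizes it cannot tell whether the address really lies inside that function, so treat it as a hint for where to look in the disassembly.
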
Best regards,

Michal

0 basvkesteren 6 months ago in reply to Michal

Hi all,

Sorry to awake an old thread, but... I'm running into the same problem here, also on a custom board. Didn't check the pin high/low trick yet, but otherwise all is pretty much identical. Did this ever get resolved?

I'm using NCS 2.7, and my exception report has PC: 0x00013000. If I read the .map file correctly (build\zephyr\zephyr_final.map), code starts at 0x18000, so I'm not sure what's happening there.
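
One possible reading, offered as an assumption rather than a confirmed diagnosis: a PC below 0x18000 may simply belong to a different image than the one described by zephyr_final.map. The sketch below checks an address against a flash layout. The layout is hypothetical: the 0x10000 boundary is taken from the "Bootloader chainload address offset" line in the boot log earlier in this thread, 0x18000 is the code start mentioned above, and the region names and the 1 MB upper bound are assumptions. The real layout for a given build is in the partitions.yml that Partition Manager generates in the build directory.

```python
# Hypothetical layout: (name, start, end-exclusive). Check your own
# build's partitions.yml; these numbers are inferred, not authoritative.
LAYOUT = [
    ("bootloader", 0x00000, 0x10000),
    ("secure image (TF-M)", 0x10000, 0x18000),
    ("non-secure application", 0x18000, 0x100000),
]


def region_for(addr, layout=LAYOUT):
    """Return the name of the region containing addr, or None if unmapped."""
    for name, start, end in layout:
        if start <= addr < end:
            return name
    return None


for pc in (0x00012B8A, 0x00013000):
    print(f"{pc:#010x} -> {region_for(pc)}")
```

Under these assumed boundaries both reported PC values fall in the secure image, which would be consistent with the "Exception came from secure FW" line in the earlier dump. If that holds for your build, the map file or ELF of the secure image is the one to search, not the application's.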

0 e-va 6 months ago in reply to basvkesteren

Hello basvkesteren,

we did not find time to investigate the problem further after checking the map file, and we were also unsure how to interpret it. For now, we have settled on keeping the baud rate below the threshold that triggers the problem. We have not seen the problem again at this baud rate.

BR

0 basvkesteren 6 months ago in reply to e-va

Thanks!

Guess I'll have to do the same, for now.

Anyone from Nordic with an explanation for this? Hieu ?

0 Michal 5 months ago in reply to basvkesteren

Unfortunately, we don't have a concrete explanation. I will discuss with my colleague whether there is anything more that can be done here.

Best regards,

Michal

0 Michal 5 months ago in reply to Michal

Could you open another ticket in which you describe your issue completely and refer to this one as well?

Please include the exact pins used and your .dts file.

Best regards,

Michal