SDK 17.1.0: secure bootloader runtime failure when compiled with current gcc (11.2.0)

Hi, I have encountered the following problem with SDK 17.1.0:

The secure bootloader, compiled with the current stable gcc version (11.2.0), fails at runtime with the following error:

<info> app: Entering DFU mode.
<debug> app: Initializing transports (found: 1)
<debug> nrf_dfu_ble: Initializing BLE DFU transport
<debug> app: Failed to initialize transport 0, error 9
<error> app: Could not initalize DFU transport: 0x00000009
<error> app: Received an error: 0x00000003!

While trying various combinations, I sometimes got error 7 instead.
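For reference, the numbers in the log look like the generic nRF error codes from nrf_error.h. A minimal sketch of decoding them follows; `nrf_error_name` is a hypothetical helper written for this post, not an SDK function, and it only covers the codes seen in this thread:

```c
#include <stdint.h>

/* Hypothetical helper (not part of the SDK): maps the generic error codes
 * seen in this thread to their names as defined in nrf_error.h. */
const char *nrf_error_name(uint32_t err_code)
{
    switch (err_code) {
        case 0: return "NRF_SUCCESS";
        case 3: return "NRF_ERROR_INTERNAL";        /* "Received an error: 0x00000003" */
        case 7: return "NRF_ERROR_INVALID_PARAM";   /* the occasional error 7 */
        case 9: return "NRF_ERROR_INVALID_LENGTH";  /* transport init failure */
        default: return "(not covered by this sketch)";
    }
}
```

So the transport initialization fails with NRF_ERROR_INVALID_LENGTH, and the bootloader then reports NRF_ERROR_INTERNAL.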

There's also a small problem when compiling components/libraries/bootloader/ble_dfu/nrf_dfu_ble.c: some warnings are treated as errors, but temporarily removing

CFLAGS += -Wall -Werror

from the Makefile works around this.

Earlier I noticed that the whole SDK seems to expect gcc 9.3.0 (I had to modify components/toolchain/gcc/Makefile.posix to use my system-wide compiler instead of the path hardcoded there). So I finally tried compiling with that version. I installed it side by side with the current compiler, compiled the bootloader with it, and voilà: the bootloader works.

So there is certainly a problem that depends on the compiler version. That's odd, because I have a dozen or so ARM projects here, using MCUs from NXP, Freescale and ST, and the current compiler version works flawlessly with all of them. Even my own nRF firmware that I'm currently working on works just fine with the new compiler. The bootloader is the only piece of code that's causing trouble.

Yes, I can hear you: just use the old compiler. That's acceptable as a temporary solution, but not really maintainable in the long run.

I'm using the unmodified bootloader straight from the SDK examples, so I think it should be easily reproducible.

Please have a look at the issue.

Parents
  • Hi Michael, 

    I tried this here and got the same error when testing with gcc v11.2.

    I managed to trace the error down to the err_code = nrf_dfu_mbr_init_sd(); call in ble_stack_init(). That is the function that returns error 9.

    It's a call to the MBR. It must be something wrong with the declaration of the SVC call. I checked the MBR source and there is no path in it that returns error 9.

    The problem is that we don't test with any gcc compiler version other than the one that comes as the default in the SDK (and we can't keep testing with new gcc releases). Without thorough testing we can't recommend using a newer gcc compiler. So we don't have any other recommendation other than you may need to attach the correct gcc version to the SDK setup and keep it separate from your system-wide compiler.

    (FYI: at least in my test here, with the last version of the GNU ARM Embedded Toolchain (before they merged), v10.3-2021.10, the nrf_dfu_mbr_init_sd() function worked fine and returned 0.)
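One way to make a toolchain mismatch visible at build time is a small version guard. The following is only a sketch: it assumes GCC major version 9 is the tested one, as stated in this thread, and `TESTED_GCC_MAJOR` and `toolchain_is_tested` are hypothetical names, not part of the SDK.

```c
#include <stdbool.h>

/* Assumption: GCC major version 9 is the tested toolchain, per this thread. */
#define TESTED_GCC_MAJOR 9

/* Hypothetical helper (not part of the SDK): true when the given GCC major
 * version matches the tested one. */
bool toolchain_is_tested(int gcc_major)
{
    return gcc_major == TESTED_GCC_MAJOR;
}

/* Emit a build-time note when compiled with a different GCC major version.
 * #pragma message is used instead of #warning so that -Werror builds still pass. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ != TESTED_GCC_MAJOR)
#pragma message("GCC major version differs from the one tested with the SDK")
#endif
```

Calling toolchain_is_tested(__GNUC__) early at startup could additionally log the mismatch at runtime.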

Children
  • Thank you for your insight and support; the prompt answer is really appreciated.

    Unfortunately, that's the kind of answer I was expecting.

    It must be something wrong with the declaration of SVC call.

    Probably. I didn't try to debug it on my own; I thought that you were the right party to track down the problem. You built the hardware, you wrote all the pieces of software. I have to accept your decision not to dig deeper into the problem as a considered one. I respect your priorities.

    Regarding the preferred choice of the compiler, I beg to differ, please bear with me:

    When any software vendor is asked "which version of your software should be used", usually the correct answer is "use the last stable version, it's the best one as of today". When I ask you, you probably will point me to the newest version of your SDK and SoftDevice.

    The current stable gcc version is 11.2. That's what the gcc authors consider the current compiler. I consider this the most authoritative opinion. The "blessed" one. But I can understand that using fresh versions of all software means constant churn and is sometimes too much.

    The second level of blessedness is ARM's distribution of gcc. They decided to stay one major version behind. It's a reasonable choice: maybe they prefer to do more testing, which is time consuming, or maybe there are other objections to staying on the bleeding edge of compiler technology. Even the Debian stable Linux distribution, known as a software archeology site, is already at gcc major version 10. So let's assume this is the conservative choice, the least common denominator.

    Your choice to suggest version 9, which is two major versions behind, is odd. It's obsolete and archival.

    The problem is that we don't test with any other gcc compiler version (and we can't keep testing with new gcc releases)

    Why not? OK, for the very latest versions, I can agree. But using the current version of ARM's Embedded Toolchain, not some obsolete one, should be good practice if not mandatory. I'm aware that I'm not in a position to tell you what you should and shouldn't do, so please consider this a polite suggestion.

    Sticking to an obsolete compiler version is problematic, at least for a considerable group of your users. There are ongoing compatibility and maintenance issues. The Linux ecosystem, especially the development side, moves quite fast. System libraries and other dependencies break. Yes, I know, I can always compile the compiler from sources...

    I'm also using C++ in my projects, and I try to follow the latest language developments. Getting stuck on an old compiler version is prohibitive here; I just can't use the niceties of modern C++. std::byte is a nice abstraction over uint8_t everywhere.

    Let's also imagine I'm using a substantial library from another vendor in my project, this vendor made some mistakes, their code doesn't compile with your selected compiler version, and I'm expected to use yet another random ancient version. Hell...

    v 10.3-2021.10 the nrf_dfu_mbr_init_sd() function worked fine and returned 0.

    So gcc 10 has a chance of working. But as you say, you don't test your software with this version, only with version 9. If gcc 11 doesn't work, then there's a chance that some code is trying to outsmart the compiler and can introduce compiler-dependent bugs, so using gcc 10, untested by you, doesn't feel safe either.

    you may need to attach the correct gcc version to the SDK setup

    If your freshly released software, as of today, doesn't compile and work properly with a contemporary compiler version, I consider this to be a bug, a definite one. In your software, or in the compiler :)

    So I kindly ask you to reconsider your stance and not stay so far in the past with the compiler choice.

    Nevertheless, once again thank you for your prompt support! If anything above reads as harsh: English is a foreign language for me and I had no intention of being harsh. It's just a little bit of astonishment.

    Best regards,

  • Hi Michael,

    You are right and your suggestions make a lot of sense. Normally, we would keep our SDK compatible with the latest toolchain update and try to match the newest gcc version.

    However, the nRF5 SDK is now considered a legacy development platform, and we have not released a new major version of it since June 2020 (SDK v17.0). Our resources are now focused on the nRF Connect SDK. It's likely that v17 will be the last major version of the nRF5 SDK.

    I hope you understand our situation and will consider using the gcc version that's tested with the SDK. But of course please provide your feedback to our sales representative in your country. We always listen to you and will try our best to help solve your problem.

  • Hi Hung,

    I am currently running 10.3-2021.10 and am seeing the same error with SDK 17.1.0. Are you positive it is a compiler issue?

  • @kentorth: Please create a new ticket for your question. Please try using the version that's recommended for the SDK: GCC ARM Embedded 9.2020-q2.major
