
Internal error from RADIO_IRQHandler in GZLL_dynamic_pairing example using SES.

(added comment regarding initialisation of GZP_PARAMS, Sunday 19 Aug)

I would very much like to use the gzp_dynamic_pairing package on the nRF52. Unfortunately I cannot afford Keil or other expensive IDEs. I tried unsuccessfully to use Eclipse, then switched to the command line with the ARMgcc tools. When I learned that Nordic had a deal with SEGGER to use SES (thank you very much, Nordic) I switched to that, it being radically easier than having 10 terminal windows open on two Linux screens. I succeeded in installing SES and in rebuilding and debugging several examples, including gzll_ack_payload, on two PCA10040 development boards.

But there I met a bit of a barrier: there are no SES or ARMgcc environments set up for the Gazell dynamic pairing example. So I set about creating them, starting from the gzll_ack_payload SES setup. I have succeeded in compiling and building both device and host (with one or two source changes, see later).
I can run the .hex files provided with the development kit for both host and device, apparently successfully, so the hardware seems to be OK.
I can replace the device .hex with one built from source in the SES environment. It runs successfully as far as I can tell. If I then replace the host .hex with one built in SES, it fails.

If I run the host without turning on the device, the host announces itself thus "<info> app: Gazell dynamic pairing example started. Host mode." and awaits the client. The moment that the client begins, the host crashes out to an exception relating to the internal logic of the RADIO_IRQHandler. This is true either with the "device" software I compiled or the original .hex file for the device.

I have struggled for a week trying to discover what is wrong with the environment which might cause this.

Now for the details.

I amended gzll_goto_idle() (which calls nrf_gzll_disable) to add asserts to get closer to the error, thus (leaving the other ASSERTs in situ):

static void gzll_goto_idle(void)
{
    int count = 0;  /* debug only: number of polls before the assert fires */

    nrf_gzll_disable();
    ASSERT(nrf_gzll_get_error_code() == NRF_GZLL_ERROR_CODE_NO_ERROR);

    /* Wait for Gazell to report that it is disabled. */
    while (nrf_gzll_is_enabled())
    {
        count = count + 1;
        ASSERT(nrf_gzll_get_error_code() == NRF_GZLL_ERROR_CODE_NO_ERROR);  /* <<<<< usually trips here */
    }
}


Usually the back trace of the error is as follows:

main()
gzp_host_execute()                (from line 167 of main())
gzp_process_address_req(rx_payload)           (from line 366 nrf_gzp_host.c)
gzll_goto_idle()
assert_nrf_callback(...)          (from the ASSERT(nrf_gzll_get_error_code() == NRF_GZLL_ERROR_CODE_NO_ERROR) on the line marked <<<<< above)
app_error_fault_handler(NRF_FAULT_ID_SDK_ASSERT, 0, (uint32_t)(&assert_info)) (from line 51 nrf_assert.c)


with "count" between 160 and 170. BUT sometimes it trips at an assert placed in main.c immediately after gzp_host_execute() and sometimes at an assert placed after the call to gzll_goto_idle() in gzp_process_address_req(). i.e. it is certainly time dependent (though fairly consistent).
All the errors are the same as far as I can see, only the timing is different.

Unfortunately none of these are particularly helpful in determining the root cause of the trap since app_error_fault_handler has read an internal gzll variable called "m_nrf_gzll_error_code" which has been written from an interrupt (I believe).
Putting a watch breakpoint on byte "m_nrf_gzll_error_code" leads to the following:

0x0F is written to "m_nrf_gzll_error_code" (1 byte) by nrf_assert_internal_callback (0 is "no error")
0x000a04f6 is written to "m_nrf_gzll_internal_debug_code" (4 bytes) in the same subroutine.

The error code was raised by RADIO_IRQHandler, which is loaded at 0xA0A0, via a call to nrf_assert_internal_parse_and_forward from location 0xA2A6,
i.e. 0x0206 bytes from the start of "RADIO_IRQHandler". "nrf_assert_internal_parse_and_forward" is synonymous with "nrf_assert_internal_callback"
since the former does an unconditional branch to the latter.
To decide whether to call "nrf_assert_internal_parse_and_forward", RADIO_IRQHandler reads the register at offset 0x110 from the APB radio base of 0x40001000,
which is "radio is disabled", compares it against the value 1, and skips the error call if they are equal.
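The address arithmetic in this paragraph can be double-checked with a few lines of C. This is only a sketch: the function and macro names are my own (hypothetical), and the constants are the ones quoted above. Offset 0x110 is named EVENTS_DISABLED here on the assumption that it matches the RADIO register map in the nRF52 product specification.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the post; the macro names are hypothetical. */
#define RADIO_BASE                   0x40001000u  /* APB base of the RADIO peripheral */
#define RADIO_EVENTS_DISABLED_OFFSET 0x110u       /* "radio is disabled" event register */

/* Distance in bytes from the start of a handler to a call site inside it. */
uint32_t handler_offset(uint32_t call_site, uint32_t handler_start)
{
    assert(call_site >= handler_start);  /* call site must lie after the entry point */
    return call_site - handler_start;
}

/* Absolute address of a RADIO register, given its offset from the base. */
uint32_t radio_register_address(uint32_t offset)
{
    return RADIO_BASE + offset;
}
```

With the numbers from the watchpoint session, handler_offset(0xA2A6, 0xA0A0) gives 0x206, and radio_register_address(RADIO_EVENTS_DISABLED_OFFSET) gives 0x40001110, which is the address to watch in the debugger's memory view.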

Thus it would appear that the radio is expected to be off when it is not and hence an internal error is declared.
The error manifests itself in the loop awaiting the radio to be off so there seems to be a problem in the "turn off" system but I don't know how it works to debug any further.
The only definition of errors that I have found is "nrf_gzll_error_code_t" which, for 0x0F reads "An invalid channel table size was given as an input to a function".
At the error I have checked the channel table which contains 5 entries, all of which seem OK (as set by the program). Furthermore, the routine which sets 0x0F is called
from many places in "RADIO_IRQHandler" so it seems unlikely that "nrf_gzll_error_code_t" is being used.
Therefore I conclude that I do not know what error 0x0F is nor the meaning of the additional information 0xA2A6.


I am hoping that someone with access to the gzll source code can use these details to advise me on the root cause of the problem.


There just MUST be something wrong with my configuration of SES since the code has been compiled to produce the working .hex file supplied with the Development environment
but for the life of me I cannot find it.

If someone could give me a hint as to how I could trigger this internal error by making a screw-up in the development environment I would be grateful. (Or is this the reason that there are no ARMgcc or SES environments provided for this example?)

The entire environment is much too big to post (170 MB), but if anyone wants what I have done, I will arrange to share it, or any files from it you wish, via Dropbox or OneDrive.

What I have tried:

1. Calling nrf_gzll_disable(); and "result_value = nrf_gzll_enable(); GAZELLE_ERROR_CODE_CHECK(result_value);" many times in various places in the code. None triggered the exception.
2. Putting in some delays by printing to the debug console in a loop in various places (e.g. between the disable call and the testing loop). No detectable impact.


The source changes that have been made in order to compile and make it run at all:

Change 1:

To eliminate the following compiler error caused by the fact that ARMgcc does not implement the @ directive for direct linker placement of variables:
\nRF5_SDK_15.0.0_a53641a\components\proprietary_rf\gzll\nrf_gzp_host.c:260:1: warning: 'at' attribute directive ignored [-Wattributes]

So I changed the lines in nrf_gzp_host.c from:
249 #if defined(__ICCARM__)
250 #if GZP_PARAMS_DB_ADR == 0x1000
251 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE/4] @ "gzp_dev_data"
252 #elif GZP_PARAMS_DB_ADR == 0x15000
253 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE/4] @ "gzp_dev_data_sd"
254 #else
255 #error
256 #endif
257 #else
258 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE / 4] __attribute__((at(GZP_PARAMS_DB_ADR)))
259 #endif
to:
249 #if defined(__ICCARM__)
250 #if GZP_PARAMS_DB_ADR == 0x1000
251 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE/4] @ "gzp_dev_data"
252 #elif GZP_PARAMS_DB_ADR == 0x15000
253 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE/4] @ "gzp_dev_data_sd"
254 #else
255 #error
256 #endif
257 #elif defined(__GNUC__)
258 static volatile const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE / 4] __attribute__((section(".GZP_PARAMS")))
259 #else
260 static const uint32_t database[GZP_DEVICE_PARAMS_STORAGE_SIZE / 4] __attribute__((at(GZP_PARAMS_DB_ADR)))
261 #endif

i.e. I added lines 257 and 258 in order to use the GNU C form of __attribute__, and added a section placement in the "Solution options" for the SEGGER project thus:

.GZP_PARAMS RX 0x15000 0x1000
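For anyone unfamiliar with the GNU form, here is a minimal, self-contained sketch of the same idea. It is not the SDK code: the array name, its size and the helper functions are hypothetical, and only the section name .GZP_PARAMS is taken from my change above. Note that the attribute only names the section; the address (0x15000 in my case) comes entirely from the linker placement, so without that placement the array lands wherever the linker chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PARAMS_STORAGE_SIZE 64u  /* hypothetical size in bytes, not the SDK value */

/* Only GNU-compatible compilers on ELF targets understand this spelling. */
#if defined(__GNUC__) && defined(__ELF__)
#define DEMO_PARAMS_SECTION __attribute__((section(".GZP_PARAMS")))
#else
#define DEMO_PARAMS_SECTION
#endif

/* Pre-filled with 0xFFFFFFFF so that the region looks like erased flash. */
static volatile const uint32_t demo_params_db[DEMO_PARAMS_STORAGE_SIZE / 4] DEMO_PARAMS_SECTION = {
    0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu,
    0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu,
    0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu,
    0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu,
};

/* Number of 32-bit words in the demo database. */
size_t demo_params_word_count(void)
{
    return sizeof(demo_params_db) / sizeof(demo_params_db[0]);
}

/* Returns 1 if every word still reads as erased flash, 0 otherwise. */
int demo_params_is_erased(void)
{
    for (size_t i = 0; i < demo_params_word_count(); i++)
    {
        if (demo_params_db[i] != 0xFFFFFFFFu)
        {
            return 0;
        }
    }
    return 1;
}
```

The map file produced by the linker is the place to confirm that the section really ended up at the intended address.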

There are no FLASH placement files in use.

Change 2:

Without a SoftDevice, GZP_PARAMS_DB_ADR is 0x1000, which is in the middle of the flash code for the gzp pairing example, so I changed it to 0x15000 (as above).
This is found in nRF5_SDK_15.0.0_a53641a\examples\proprietary_rf\gzll\gzp_dynamic_pairing\host\config\nrf_gzp_config.h

Added note:

I have confirmed that memory from 0x15000 for 0x1000 bytes is written with FFFFFFFF in the .HEX file as the code definition for database[GZP_DEVICE_PARAMS_STORAGE_SIZE/4] goes to some lengths to arrange (REP4 defines). What is REALLY odd is that the .hex file provided ready-made for the dynamic pairing has no such initialisation, neither at 0x1000 nor 0x15000 nor anywhere else. The maximum number of consecutive bytes written with FF is 16. There again, I don't really know how significant that is.
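The kind of check described above can be automated. The following is a rough sketch, assuming the standard Intel HEX record layout (":", byte count, 16-bit address, record type, data, checksum); the function names are hypothetical. It counts the 0xFF data bytes in a single record and rejects records whose checksum does not add up. Tracking runs across records would also need the address fields, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Value of two hex digits, or -1 if either is invalid. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    int lo = (hi < 0) ? -1 : hex_nibble(s[1]);
    return (hi < 0 || lo < 0) ? -1 : ((hi << 4) | lo);
}

/*
 * Count the 0xFF data bytes in one Intel HEX record.
 * Returns the count for a data record (type 00), 0 for any other
 * record type, and -1 if the line is malformed or fails its checksum.
 */
int ihex_count_ff(const char *line)
{
    size_t len = strlen(line);
    if (len < 11 || line[0] != ':') return -1;

    int count = hex_byte(line + 1);
    if (count < 0 || len < 11 + 2 * (size_t)count) return -1;

    unsigned sum = 0;
    int type = 0;
    int ff = 0;

    /* Bytes: count, address high, address low, type, data..., checksum. */
    for (int i = 0; i < count + 5; i++)
    {
        int b = hex_byte(line + 1 + 2 * i);
        if (b < 0) return -1;
        sum += (unsigned)b;
        if (i == 3) type = b;
        if (i >= 4 && i < 4 + count && b == 0xFF) ff++;
    }

    /* All bytes including the checksum must sum to 0 modulo 256. */
    if ((sum & 0xFFu) != 0) return -1;
    return (type == 0) ? ff : 0;
}
```

Feeding each line of the .hex file to ihex_count_ff() and looking for records that return the full byte count (16 for the usual record length) shows which records are entirely 0xFF.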

Info:

I am a retired (73 year old) computer techie struggling to learn how to program the nRF52 using the cheap (read free) command line tools of gnu and Linux. Now changing to SEGGER.
Memory (mine) is not what it used to be (it takes many writes to make it permanent) and neither is brain speed (now underclocked) so please bear that in mind.

Hardware:

Hardware = 2 off PCA10040 V1.1.1 2017 5 682465971 and 682839444

Software:

nRF52 Software Development kit is nRF5_SDK_15.0.0_a53641a


On Windows:
SEGGER Embedded Studio for ARM
Release 3.40 Build 2018052200.36079
Windows 10 x64
GCC/BINUTILS: Built using the GNU ARM Embedded Toolchain version 7-2017-q4-major source distribution

On Linux:
Suse Leap 42.3.20170911 64 bit fully updated
on 16 Gbyte memory Intel Core i7-2600K CPU @ 3.40GHz
GCC Arm toolchain = gcc-arm-none-eabi-7-2017-q4-major
(actual version is arm-none-eabi-gcc 8.1.0)
using gdb via connection to SEGGER J-Link GDB server using OB link over ARM SWD
SEGGER J-Link GDB server V6.32i July 24 2018
SEGGER J-Link RTT Client   Compiled Jul 24 2018 15:21:19 (Version unknown)
PuTTY: Release 0.68 Build platform: 64-bit Unix (GTK + X11)
Compiler: gcc 4.8.5 Compiled against GTK version 2.24.31

  • SES compiles much faster on Linux than on Windows, but works great on both. It's a great alternative to using gcc+gdb on *nix-based platforms. I haven't had any issues with UART or RTT on either OS, but I usually use normal debug mode when debugging is needed. That might have masked some potential issues.

  • Thanks Håkon, you have given me the confidence to carry on - a wonderful service from Nordic so I thank you very much.

    You are right, SES is a very professional environment. It has a few quirks that take some getting used to, but in general I learned it pretty quickly. Bye.

  • Thank you, Nordic team, for your precious help. Is the project sdk_15_gzp_dynamic_pairing_ses_device.zip by any chance also available to download? Is there any particular reason why these two _ses projects are not provided by default in the latest SDK (15.3)? Inside the Keil gzll device project I can see these preprocessor symbols:

    BOARD_PCA10056 BSP_SIMPLE CONFIG_GPIO_AS_PINRESET FLOAT_ABI_HARD GAZELL_ALTERNATIVE_RESOURCES GAZELL_PRESENT NRF52840_XXAA SWI_DISABLE0 USE_SD_HW_RESOURCES _HEAP_SIZE=8192 _STACK_SIZE=8192

    Since neither the SOFTDEVICE_PRESENT macro nor NO_VTOR_CONFIG is defined, could you please confirm that the precompiled library gzll_nrf52840_sd_resources_gcc.a doesn't make use of the SoftDevice (s140 for the pca10056 board)? If so, the "sd" in the name of the library is a little confusing. In that case (SoftDevice not used), can I safely use the following simple linker section placement macro:

    linker_section_placement_macros="FLASH_PH_START=0x0;FLASH_PH_SIZE=0x100000;RAM_PH_START=0x20000000;RAM_PH_SIZE=0x40000;FLASH_START=0x0;FLASH_SIZE=0x100000;RAM_START=0x20000000;RAM_SIZE=0x40000"
    linker_section_placements_segments="FLASH RX 0x0 0x100000;RAM RWX 0x20000000 0x40000" ?

    It is very important to understand how to set up memory properly for the gzll device project. I already had it up and running in the SES environment, but the program was not able to start an encrypted dialogue with the host correctly. Apparently many routines in the gzp.c file failed when calling memcpy: the memory was not copied correctly, even when I used a custom memcpy implementation. May I ask whether the information in the linker section placement macro is always in sync with the content of the .emProject XML file? Should the .emProject XML file never be edited by hand?

    In your project gzll_ack_payload_device_pca10056 (that does use the gzll_nrf52840_sd_resources_gcc.a library) I also found inside the .emProject file:

    c_preprocessor_definitions="BOARD_PCA10056;CONFIG_GPIO_AS_PINRESET;FLOAT_ABI_HARD;GAZELL_ALTERNATIVE_RESOURCES;GAZELL_PRESENT;INITIALIZE_USER_SECTIONS;NO_VTOR_CONFIG;NRF52840_XXAA;SWI_DISABLE0;USE_SD_HW_RESOURCES;"

    and

          linker_section_placement_macros="FLASH_PH_START=0x0;FLASH_PH_SIZE=0x100000;RAM_PH_START=0x20000000;RAM_PH_SIZE=0x40000;FLASH_START=0x0;FLASH_SIZE=0x100000;RAM_START=0x20000000;RAM_SIZE=0x40000"

    I started from a clean gzll_ack_payload_device_pca10056 as a template project. To build the dynamic pairing device program, I had to copy the nrf_gzp_config.h file, add a gzp include directory, and add nrf_gzp.c, nrf_gzp_device.c, nrf_ecb.c and nrf_nvmc.c. The project builds successfully, but I am still trying to debug it: in Release mode, with all other configurations left at their defaults, it keeps hitting a hard fault. In Debug mode the program seems to run, but the host does not react to the Gazell device's transmissions (even though they share the same secret key and other parameters in nrf_gzp_config.h). Thanks in advance for taking the time to reply.
