HARDFAULT in mesh SDK ble_mesh_v0.9.1-Alpha access.c

SimonJudge

asked 2017-08-11 11:21:56 +0100

updated 2017-08-11 11:27:15 +0100

I have the mesh serial example working on the nRF51-DK (nRF51422). I am now trying to run the same code, via the nRF51-DK debug port, on an nRF51822 QFAC (32 kB RAM) based board. It downloads and runs up to the following line, where I get a HARDFAULT:

static void access_state_clear(void)
{
    memset(&m_model_pool[0], 0, sizeof(m_model_pool));
    /* ... */
}

A break at this line shows the m_model_pool memory is allocated (it's static anyway) and accessible. Stepping over this line gives a HARDFAULT. What might be causing this?

The nRF51-DK (nRF51422) and the nRF51822 QFAC have the same memory maps so I am using the same build settings.



1 answer

thomas.stenersen

answered 2017-08-11 21:29:28 +0100

updated 2017-08-11 21:31:18 +0100

Hi Simon,

The Bluetooth mesh stack operates in a SoftDevice Timeslot to allow running both the mesh stack and the SoftDevice concurrently. This causes a problem during debugging, though. A timeslot is a window of time that can be requested from the SoftDevice where it hands over the ownership of the radio (and some other peripherals). If the application doesn't yield the timeslot before its end, the SoftDevice will trigger an assert/HARDFAULT, since its internal scheduler cannot meet its deadlines any more and is left in an undefined state.

That's what's happening when you're stepping your code. When doing a step, the debugger re-enables the CPU, the TIMER0 interrupt fires and the SoftDevice triggers the assert. To circumvent this, you can use "step instruction" instead of the normal "step".
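
To illustrate why halting in the debugger is fatal here, the following is a toy model in C, not SoftDevice code. A timeslot is described by a start time and a length, and the application is in trouble as soon as the timer passes the end of the slot while the application still holds it. The struct and function names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a granted timeslot (times in microseconds). */
typedef struct
{
    uint32_t start_us;
    uint32_t length_us;
} timeslot_t;

/* Returns true if the application still holds the timeslot after its end.
 * Unsigned subtraction keeps the comparison valid across timer wraparound. */
bool timeslot_overrun(const timeslot_t *slot, uint32_t now_us)
{
    assert(slot != NULL);
    return (uint32_t)(now_us - slot->start_us) > slot->length_us;
}
```

In this model a breakpoint stops the code that would yield the slot, but not the time that now_us represents, so a single source-level step can easily end up past the end of the slot.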

When you get an assert or HARDFAULT, the examples should print the program counter over RTT. To get the file and line, use the addr2line utility provided with the GNU toolchain(s).

addr2line -a <program counter value> -e <path to my_program.elf>

Make sure you've compiled your program with debugging symbols.
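
For reference, a minimal sketch of where such a program counter value comes from. On Cortex-M cores, exception entry pushes eight words onto the stack (R0-R3, R12, LR, PC, xPSR), so the stacked PC is the seventh word of that frame. The helper name below is hypothetical and this is not the SDK's handler; a real HardFault handler also needs a few lines of assembly to pass the active stack pointer in, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word offsets in the Cortex-M exception stack frame. */
enum
{
    FRAME_R0 = 0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS
};

/* Returns the program counter that the core stacked on exception entry.
 * `frame` must point at the stack pointer value captured by the handler. */
uint32_t stacked_pc(const uint32_t *frame)
{
    assert(frame != NULL);
    return frame[FRAME_PC];
}
```

The returned value is the address to hand to addr2line.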

Please note that there is a bug in serial_bearer.c, where the packet buffers need to be word aligned. See the fix in the diff below.

/********** Static variables **********/

+/* Buffers must be word-aligned to the largest possible serial packet. */
+#define RX_BUFFER_SIZE ALIGN_VAL(sizeof(serial_packet_t) + sizeof(packet_buffer_packet_t), WORD_SIZE)
+#define TX_BUFFER_SIZE ALIGN_VAL(sizeof(serial_packet_t) + sizeof(packet_buffer_packet_t), WORD_SIZE)
-static uint8_t m_tx_buffer[sizeof(serial_packet_t) + sizeof(packet_buffer_packet_t)];
+static uint8_t m_tx_buffer[TX_BUFFER_SIZE];
 static packet_buffer_t m_tx_packet_buf;
 static packet_buffer_packet_t * mp_current_tx_packet;
 static uint16_t m_cur_tx_packet_index;
 static uint16_t m_stored_pac_len;
-static uint8_t m_rx_buffer[sizeof(serial_packet_t) + sizeof(packet_buffer_packet_t)];
+static uint8_t m_rx_buffer[RX_BUFFER_SIZE];
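
The idea behind the fix, sketched in standalone C: the Cortex-M0 in the nRF51 faults on unaligned word accesses, so buffer sizes are rounded up to a multiple of the word size. ROUND_UP_TO_WORD, EXAMPLE_WORD_SIZE and is_word_aligned are hypothetical stand-ins written for this example; they are not the SDK's ALIGN_VAL or WORD_SIZE definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_WORD_SIZE 4u  /* assumed 32-bit word, as on the nRF51 */

/* Rounds `size` up to the next multiple of the word size (a power of two). */
#define ROUND_UP_TO_WORD(size) \
    (((size) + (EXAMPLE_WORD_SIZE - 1u)) & ~(size_t)(EXAMPLE_WORD_SIZE - 1u))

/* Returns true if `p` sits on a word boundary. */
bool is_word_aligned(const void *p)
{
    return ((uintptr_t)p % EXAMPLE_WORD_SIZE) == 0u;
}

/* A 13-byte payload rounded up to four whole words. Declaring the storage
 * as uint32_t also guarantees that its start address is word aligned. */
uint32_t example_buffer[ROUND_UP_TO_WORD(13u) / EXAMPLE_WORD_SIZE];

static_assert(sizeof(example_buffer) == 16u, "buffer must span four words");
```

Rounding the size keeps every packet slot in the buffer on a word boundary, provided the buffer itself starts on one. This sketch gets that start alignment from the uint32_t element type, which is a choice made for this example rather than something taken from the diff above.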





Thanks for the explanations. The word-alignment fix has allowed me to get further. It's now hardfaulting when nrf_mesh_serial_init() is called. I have tracked this down to:

uECC_VLI_API int uECC_generate_random_int(uECC_word_t *random,
                                          const uECC_word_t *top,
                                          wordcount_t num_words) {
        if (!g_rng_function((uint8_t *)random, num_words * uECC_WORD_SIZE)) {

Are there any more data alignment problems I need to know about?



Simon Judge ( 2017-08-14 12:56:48 +0100 )

Hi Simon,

I have never seen any problems like this, unfortunately. Are you able to provide a full backtrace? In GDB, use "backtrace full" to print it out. Are you building the examples with CMake or Segger Embedded Studio? If not, maybe you're not building the uECC library correctly? The default for the serial interface is to enable ECDH offloading, i.e., offloading the ECDH shared secret computation to the PC, so the uECC library shouldn't really be needed at all.


Thomas Stenersen ( 2017-08-14 13:08:36 +0100 )

The uECC library is being used. This is with Segger ES and the example project file supplied by Nordic. This is the call stack:

uECC_generate_random_int()
uECC_make_key()
prov_utils_keys_generate()
nrf_mesh_prov_generate_keys()
serial_handler_prov_init()
nrf_mesh_serial_init()
start()
main()

Looking at the uECC_generate_random_int() code, it calls into the SoftDevice to get a random number. Breakpoints show that this call is made, and works, earlier in the code, before nrf_mesh_serial_init() is called.

It's strange this works with nRF51422 on the DK board but not with an external nRF51822.


Simon Judge ( 2017-08-22 14:13:06 +0100 )

The problem also occurs in v0.9.2.

Simon Judge ( 2017-08-23 16:44:46 +0100 )

Hi Simon,

This is very strange, indeed. Here are some debugging questions:
- Did you modify the example code in any way?
- Are you using a different clock source on the external board, and is the softdevice (and mesh) initialized with the correct one?
- Did you get a program counter address with the HARDFAULT? If so, could you provide the .elf and .map file along with the address of the fault?
- Is any of this running in an IRQ? Calling a SoftDevice function from an interrupt priority higher than the SVC will hardfault.
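
The last question can be made concrete with a small sketch. On Cortex-M, a numerically lower priority value means a higher priority, and an SVC instruction executed from a context whose priority is equal to or higher than that of the SVC handler cannot be served and escalates to a HardFault. The function name, the thread-mode marker and the priority values used with it are illustrative assumptions, not SoftDevice definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Marker for "not in an interrupt" (thread mode), which any handler preempts. */
#define PRIO_THREAD_MODE UINT32_MAX

/* Returns true if an SVC-based call can be issued from `current_prio`.
 * Lower numbers mean higher priority, so the caller must be strictly lower
 * priority (numerically greater) than the SVC handler. */
bool svc_call_allowed(uint32_t current_prio, uint32_t svc_prio)
{
    assert(svc_prio != PRIO_THREAD_MODE);
    return current_prio > svc_prio;
}
```

For example, with an SVC priority of 2, code in the main context or in a handler at priority 3 may call in, while a handler at priority 1 may not.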

I hope we'll get the problem solved soon.

Thomas Stenersen ( 2017-08-24 08:43:44 +0100 )
