sd_ecb_block_encrypt() execution time

We have a custom protocol with strict deadlines in the radio stack between receiving a packet, decrypting it, processing encrypting an ack and send the ack.

For that reason, we have strict deadlines on sd_ecb_block_encrypt(), and it seems to work for every case, except now one.

In this case, we are running:

- nrf52840

- Softdevice s140 v7.2.0

- nrf5-sdk 17.0.2

- FreeRTOS 10.0.0, the version shipped with nrf5-sdk

Our setup runs all radio code from the radio interrupt in an allocated radio slot, and thus on priority 0. Except the calls to softdevice, in which we trigger an SWI and drops to interrupt priority 5 for that call, and has the deadline specified on a TIMER0 compare. If we miss the deadline, then we log and retries later. Otherwise, we don't have any other interrupts during that period

I have measured that the call to sd_ecb_block_encrypt() and a block copy takes 15.8us in every case we have without running FreeRTOS, or when using that version of FreeRTOS while connected to a debugger, but in running state. But when disconnecting the debugger and reset the CPU, many of those calls starts 26.8us each, in a rate so we miss the deadline when sending bigger packets.

Also have a version where we upgraded FreeRTOS to a later version, and the issue seems to be gone.

We are not running any concurrent bluetooth, or having any other code using the encyrption.

To measure it, I'm using a saleae logic analyzer and pull a pin low just before the call, and high afterwards, to be able to measure timings without debugger/trace. I can't really find a pattern for it, but it seems to affect when running multiple in a row. Note: In the plot low signal means call to sd_ecb_block_encrypt()



I would have preferred to provide more steps to reproduce for this issue, but by the nature of not being able to track any pattern or being able to look at it with a debugger make it lot harder. But feeling uncomfortable without knowing the reason for this issue, since we don't know when it will appear or how to get around it. And thus lost in what to test

So what could make sd_ecb_block_encrypt() increase in execution time when not having debugger connected?

  • Oh Wow,

    This is super interesting, Just calling sd_ecb_block_encrypt() in FreeRTOS have different execution times even if it is configured to run in high priority. I will have to re-read your description and test few things. Will be back to you with hopefully somemore meaningful info.

  • I have measured that the call to sd_ecb_block_encrypt() and a block copy takes 15.8us in every case we have without running FreeRTOS, or when using that version of FreeRTOS while connected to a debugger, but in running state.

    Can you please define what a "running state" and without FreeRTOS means here? I think this is super interesting as it seems like FreeRTOS might be doing something it was not supposed to do in terms of masking non application specific interrupts. I will anyhow have to reproduce this in a minimalistic way. For that I need to understand what running state and without freertos means here? I am guessing it just means that you have tested with and without starting the freertos scheduler?

  • This is actually what confuses me a bit too, since it seems like most of the things that differ is totally unrelated. And what I find even more interesting is that just connecting with gdb, and type "mon reset" and "continue" will also "fix" the problem for the exact same executable.

    I have unfortunately no clue how to break it down further to isolate the issue, but happy to try to pinpoint it further if there are parts that may differ, for example if it is a clock issue, if I can get a clue of what peripherals or settings to look for, given what that function can do. But since that function to me is a black box, I don't know what peripherals it depends on.

    For the different scenarios we have we first need to look at what we are making. We are making a network stack, which has code running in interrupt level, and then a main loop internall. That network stack we are delivering to customers as a library. It is possible to run the main loop as a task in FreeRTOS, or use the main loop and process handling (based on contiki) as base for the program. That means, it is possible to run our library with or without FreeRTOS.

    But the part related to encryption is, even though quite deep down in the code, only dependent on this RADIO or TIMER0 interrupt running via the time slotting API waking up a lower priority interrupt via nrf_egu_task_trigger(...) to drop priority, and then do the calls. So I can not see how interrupts of lower priorities, or even thread level code, affects that call. And even just some in the same interrupt.

    I was thinking of checking if something else was running and taking up time using a lauterbach utrace, but didn't manage to get that running without also having SWD and debug session going, which means it won't trigger the issue.

    So this is actually where I'm lost. I want to isolate it, but I need input and help of what has the possibility to affect the runtime, to know what to look for. Or even if there is any information that I should provide here. And it seems to be affected by so seemingly unrelated parts, I'm not sure I can manage to reproduce it in an isolated environment either.

    I get it's hard to give a clean explanation for it with the information I provide, but I'm happy to investigate myself, just that some input or thoughts of where I should continue to look, even if it turns out to be a dead end, would be helpful.

  • I am sorry, I did not manage to look at it today, I will have to create a small setup to be able to test this, most likely I will do it early next week. I think you have given clear description and I think I would manage to try to test this time with and without FreeRTOS.

  • Max,

    I am having a bit of a hard time trying to create a simplistic project to test solely this ecb encrypt function. Instead of me trying to spend so much time on trying to configure my project to get to your, can you please attach your project here. Sorry for making you wait while I attempted to make a simplistic project to test this. But I am not able to use sd_ecb_block_encrypt outside peer manager and I do not want to test the latencies of sd_ecb_block_encrypt under peer manager umbrella as there might be many other things happening under the hood.

Related