sd_ecb_block_encrypt() execution time

We have a custom protocol in the radio stack with strict deadlines between receiving a packet, decrypting it, processing it, encrypting an ACK, and sending the ACK.

For that reason, we have a strict deadline on sd_ecb_block_encrypt(), and it seems to hold in every case, except one that we have now hit.

In this case, we are running:

- nRF52840

- SoftDevice S140 v7.2.0

- nRF5 SDK 17.0.2

- FreeRTOS 10.0.0, the version shipped with the nRF5 SDK

Our setup runs all radio code from the radio interrupt in an allocated radio timeslot, and thus at priority 0. The exception is the calls to the SoftDevice: for those we trigger an SWI and drop to interrupt priority 5, with the deadline tracked by a TIMER0 compare. If we miss the deadline, we log it and retry later. Other than that, no other interrupts run during that period.
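
To illustrate the structure, a rough sketch of the flow follows; this is not our actual code, and the SWI number, CC channels, and deadline value are placeholders:

    #include <stdbool.h>
    #include "nrf.h"
    #include "nrf_soc.h"

    #define DEADLINE_TICKS 400  /* placeholder deadline in TIMER0 ticks */

    static nrf_ecb_hal_data_t ecb_data;
    static volatile bool deadline_missed;

    /* At init: NVIC_SetPriority(SWI3_EGU3_IRQn, 5); NVIC_EnableIRQ(SWI3_EGU3_IRQn); */

    /* Radio IRQ context (priority 0): arm the deadline, then pend the SWI. */
    static void request_encrypt(void)
    {
        NRF_TIMER0->TASKS_CAPTURE[1] = 1;                        /* capture current time */
        NRF_TIMER0->CC[2] = NRF_TIMER0->CC[1] + DEADLINE_TICKS;  /* set the deadline */
        NRF_TIMER0->EVENTS_COMPARE[2] = 0;
        NVIC_SetPendingIRQ(SWI3_EGU3_IRQn);  /* runs at priority 5 once we return */
    }

    /* SWI handler (priority 5): SoftDevice calls are allowed from here. */
    void SWI3_EGU3_IRQHandler(void)
    {
        (void)sd_ecb_block_encrypt(&ecb_data);
        if (NRF_TIMER0->EVENTS_COMPARE[2]) {
            deadline_missed = true;  /* log and retry later */
        }
    }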

I have measured that the call to sd_ecb_block_encrypt() plus a block copy takes 15.8 µs in every case when we are not running FreeRTOS, and also when running that FreeRTOS version with a debugger connected (but in the running state). However, after disconnecting the debugger and resetting the CPU, many of those calls start taking 26.8 µs each, often enough that we miss the deadline when sending bigger packets.

We also have a build where we upgraded FreeRTOS to a later version, and there the issue seems to be gone.

We are not running any concurrent Bluetooth, and no other code uses the encryption.

To measure it, I'm using a Saleae logic analyzer: I pull a pin low just before the call and high afterwards, so I can measure timings without a debugger or trace. I can't really find a pattern, but it seems to show up when running multiple calls in a row. Note: in the plot, a low signal means a call to sd_ecb_block_encrypt().
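
The instrumentation around each call looks roughly like this (TIMING_PIN and the helper name are just placeholders for illustration):

    #include "nrf.h"
    #include "nrf_soc.h"

    #define TIMING_PIN 3  /* example pin; configured as output and idle-high at init */

    static nrf_ecb_hal_data_t ecb_data;

    static uint32_t timed_ecb_encrypt(void)
    {
        NRF_P0->OUTCLR = 1UL << TIMING_PIN;  /* low: call starts */
        uint32_t err_code = sd_ecb_block_encrypt(&ecb_data);
        NRF_P0->OUTSET = 1UL << TIMING_PIN;  /* high: pulse width = execution time */
        return err_code;
    }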



I would have preferred to provide more steps to reproduce this issue, but not being able to find any pattern, or to look at it with a debugger, makes that a lot harder. I am uncomfortable not knowing the reason for the issue, since we don't know when it will appear or how to get around it, and I am therefore at a loss for what to test next.

So what could make sd_ecb_block_encrypt() increase in execution time when no debugger is connected?

Replies
  • I know that there was a fix applied inside sd_ecb_block_encrypt() in S140 v7.2.0 to solve a power consumption issue that is relevant when the SEVONPEND bit is used. This fix seems to have the side effect of a longer wakeup time in many cases. I can see that the fix inside sd_ecb_block_encrypt() was removed again later in our repo, but that change has not been released in the nRF5 SDK.

    I think you can work around it by temporarily disabling SEVONPEND while requesting the block encrypt, something like the code below.

    #include <string.h>
    #include "nrf.h"
    #include "nrf_soc.h"

    static nrf_ecb_hal_data_t ecb_buf;

    void ENCRYPT_IRQHandler(void)
    {
        int i;

        ENCRYPT_GPIO_PORT->OUTSET = 1 << ENCRYPT_GPIO_TRIG_PIN;
        for (i = 0; i < 20; i++) {
            /* Just fill the buffer with something */
            memset(&ecb_buf, i, sizeof(nrf_ecb_hal_data_t));

            /* Temporarily disable SEVONPEND for the duration of the call */
            SCB->SCR &= ~SCB_SCR_SEVONPEND_Msk;

            ENCRYPT_GPIO_PORT->OUTSET = 1 << ENCRYPT_GPIO_ENC_PIN;
            sd_ecb_block_encrypt(&ecb_buf);
            ENCRYPT_GPIO_PORT->OUTCLR = 1 << ENCRYPT_GPIO_ENC_PIN;

            /* Re-enable SEVONPEND */
            SCB->SCR |= SCB_SCR_SEVONPEND_Msk;
        }
        ENCRYPT_GPIO_PORT->OUTCLR = 1 << ENCRYPT_GPIO_TRIG_PIN;
    }

    I have tested this on my end and it seems to work. Can you please test the above code change in your vendor\nRF5_SDK_17.0.2_d674dde\examples\peripheral\freertos-nrf-encrypt-timings-main\src\encrypt.c?
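
    If it is easier to integrate, the same idea could be wrapped in a small helper; the function name and the save/restore detail below are just a suggestion, not tested code:

    #include "nrf.h"
    #include "nrf_soc.h"

    /* Suggested helper: clear SEVONPEND only for the duration of the call,
     * restoring whatever value was there before. */
    static uint32_t ecb_block_encrypt_no_sevonpend(nrf_ecb_hal_data_t * p_data)
    {
        uint32_t scr = SCB->SCR;
        SCB->SCR = scr & ~SCB_SCR_SEVONPEND_Msk;
        uint32_t err_code = sd_ecb_block_encrypt(p_data);
        SCB->SCR = scr;  /* restore SEVONPEND as it was */
        return err_code;
    }

    Restoring the saved value, rather than setting the bit unconditionally, avoids enabling SEVONPEND in a context where it was not already set.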

  • Thanks. I'll try that tomorrow when I have access to the setup again. Looks promising.

  • Did you get a chance to test this, Max? Just curious.

  • I'm sorry, not yet. Some more urgent issues came up here that I needed to fix. It is still next in the pipeline after those, but it will take a day or two longer.

    If your tests were deterministic, I feel comfortable that it will work, so I'm planning to go directly to testing it in our main system. For that, though, I need to set up at least two nodes running the correct versions.

  • I am confident that I tested this well on my end, but this is a firmware workaround and we cannot be 100% sure unless you see the same. Good luck with your tests.

  • Hi,

    Sorry for the delays, but I have finally managed to test it. I haven't hooked up an oscilloscope yet, but I've tried both without and with the fix in our main system, where we had the packet loss due to missed deadlines.

    I can confirm that the fix/workaround works fine, so we no longer miss our deadline. I have now been sending a couple of hundred packets with what seems to be 100% PDR. Before, maybe one or two of those would have squeezed through.

    Thanks for the help :)

    One last question then:

    How long should we keep this workaround, and what should we look for in the release notes if it gets fixed?
