AES-GCM IV Sizes failing with nRF54 and CRACEN Backend

Hello,

We have an application that uses AES-GCM for AEAD with a 128-bit key and a 64-bit initialization vector (IV). Using the psa_aead_encrypt API works well for us on the nRF52840, where I believe we are currently using the Oberon backend. However, when using the same code on an nRF54L15, psa_aead_encrypt returns error -134, PSA_ERROR_NOT_SUPPORTED. From the PSA documentation I saw that a possible cause of this error is an unsupported IV length. As the default AES-GCM IV length is 96 bits, I tried adjusting our length to match that, and found that the encryption succeeded.

It seems, then, that the 64-bit IV length is causing psa_aead_encrypt to fail on the nRF54L15. I also tried 88-bit and 104-bit IVs and got the same error. It is my understanding that 96 bits is the recommended and default IV length for AES-GCM, but that other lengths should be supported. As mentioned, the 64-bit length works for us on the nRF52 with the Connect SDK, and that length also works on the nRF52 with the nRF5 SDK.

I'm not entirely clear which module would be causing this failure, but I am looking to know if support for other IV lengths is expected to work at this time or if support is expected to be added in the future. As far as I can tell from the generated configs, we are using the CRACEN security backend. My understanding is that we could use Oberon on the nRF54, which may be a workaround for us, but I was surprised to encounter this different limitation on the CRACEN backend.

Thank you,

Ben

  • Hello,

    Since CRACEN will not work for our application's use of AES-GCM at this time, I have used the following workaround. Enabling both the CRACEN and Oberon backends, and disabling the CRACEN AEAD component specifically got everything working.

    CONFIG_PSA_CRYPTO_DRIVER_CRACEN=y
    CONFIG_PSA_CRYPTO_DRIVER_OBERON=y
    CONFIG_PSA_USE_CRACEN_AEAD_DRIVER=n
    

    I originally tried to disable CRACEN and enable only Oberon, but I was running into an error with EC key generation failing elsewhere in our application. I was not expecting this since the Oberon backend is working fine with the same application on an nRF52 based board. The symptom was similar to these other DevZone posts [1], [2], but none of the config option changes recommended there resolved my issue. Enabling both backends got all of our crypto operations working.

    So our application is all working at this point with the workaround of using both backends. As development continues, we would certainly like to consolidate down to one backend, ideally CRACEN, to make use of the hardware acceleration.

    Please let me know if it is possible that a future NCS version could include a change to support non-default IV lengths on CRACEN. If it is not possible due to the hardware implementation, or just unlikely to be pursued, we will look at only using Oberon.

    Thanks,

    Ben

  • Hi Ben,

    There is a hardware limitation that prevents non-standard IV lengths on the nRF54L15. However, we are looking into this and wonder if you could share a bit about your use case and why your application needs a 64-bit IV.

  • Hey Einar,

    This is just an internally defined protocol that we implemented several years ago. It could be changed to use a 96 bit IV, or to use 0 padding to get to 96 bits instead of feeding the smaller IV into AES-GCM directly. That change would just require updating code on a good number of fielded devices, so the cost of just using Oberon is probably less for us than making the change on all the other systems to accommodate this limitation of CRACEN at least for the time being.
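
    As a rough illustration, the zero-padding alternative mentioned above could look like the sketch below. `pad_iv_to_96_bits` is a hypothetical helper name, not part of any SDK, and appending the zeros (rather than prepending them) is an arbitrary choice that both ends of the protocol would have to agree on. A padded IV produces a different ciphertext and tag than feeding the 8-byte IV into AES-GCM directly, which is why it would mean updating the fielded devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GCM_DEFAULT_IV_LEN 12u /* 96 bits */

/*
 * Hypothetical helper: copy a short IV into a 12-byte buffer and zero-fill
 * the remaining bytes.
 * Returns 0 on success, -1 if the IV is missing, empty or longer than 12 bytes.
 */
int pad_iv_to_96_bits(const uint8_t *iv, size_t iv_len,
                      uint8_t out[GCM_DEFAULT_IV_LEN])
{
    assert(out != NULL);

    if (iv == NULL || iv_len == 0 || iv_len > GCM_DEFAULT_IV_LEN) {
        return -1;
    }
    memset(out, 0, GCM_DEFAULT_IV_LEN);
    memcpy(out, iv, iv_len);
    return 0;
}
```

    The padded buffer would then be passed as a 12-byte nonce to psa_aead_encrypt in place of the original 8-byte IV.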

    We had originally chosen 64 bits just to save a few bytes in a size-constrained communication environment. The additional time of AES-GCM running a hash to increase the IV size is inconsequential to us. We send few enough messages before key rotation such that reuse of a random IV even at the smaller size is sufficiently unlikely. We could have done the 0 padding prior to feeding in the IV as mentioned above, but it did not occur to anyone at the time that security stacks would not have support for other IV sizes.
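
    The reuse risk can be estimated with the birthday bound: for n messages under one key with uniformly random b-bit IVs, the probability of at least one repeated IV is roughly n(n-1)/2^(b+1). A small sketch of that estimate follows; `iv_collision_probability` is a made-up helper, and the approximation is only meaningful while the result is much smaller than 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate probability that at least two of n_messages uniformly random
 * iv_bits-bit IVs are equal: p ~= n * (n - 1) / 2^(iv_bits + 1).
 * The result is clamped to 1.0 where the approximation breaks down.
 */
double iv_collision_probability(uint64_t n_messages, unsigned iv_bits)
{
    assert(iv_bits >= 1u && iv_bits <= 128u);

    if (n_messages < 2u) {
        return 0.0; /* a single message cannot collide with anything */
    }

    double n = (double)n_messages;
    double p = n * (n - 1.0) / 2.0; /* number of message pairs */

    for (unsigned i = 0; i < iv_bits; i++) {
        p /= 2.0; /* divide by 2^iv_bits, exact for powers of two */
    }
    return p > 1.0 ? 1.0 : p;
}
```

    For example, with 65536 messages per key the estimate is about 1.2e-10 for a 64-bit IV and about 2.7e-20 for a 96-bit IV.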

    Thanks

  • Hi Ben,

    I see, thank you for sharing the background. It will not be possible to support IV lengths other than 96 bits in CRACEN, and we plan to update the documentation to make that clear. So the alternatives are to either use Oberon as you do now or modify/update your protocol. (Note that Oberon on the nRF54L is not fully supported, so there will be some limitations, such as not being able to use HW protected keys.)

  • After some experimentation, I found out that it is indeed possible to perform AES-GCM encryption/decryption with CRACEN with any-sized IV. You just need to add extra DMA descriptors to perform that additional GHASH step in the beginning of the operation to calculate the correct J0 (initial counter), and then additionally make sure that J0 is used at the end when encrypting the tag in the final step.
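
    For reference, when the IV is not 96 bits long, GCM defines J0 = GHASH_H(IV || 0^(s+64) || [len(IV)]_64): the IV is zero-padded to a multiple of 16 bytes and followed by one more block that holds the IV length in bits as a 64-bit big-endian value. The sketch below only builds that GHASH input in software and shows the 32-bit increment that turns J0 into the first data counter block. The function names are invented for illustration, and the GHASH itself is left to the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GCM_BLOCK 16u

/*
 * Writes IV || zero padding || 0^64 || [bit length of IV]_64 into out.
 * Returns the number of bytes written, or 0 if out_cap is too small.
 */
size_t gcm_j0_ghash_input(const uint8_t *iv, size_t iv_len,
                          uint8_t *out, size_t out_cap)
{
    assert(iv != NULL && out != NULL && iv_len > 0);

    size_t padded = (iv_len + GCM_BLOCK - 1u) / GCM_BLOCK * GCM_BLOCK;
    size_t total = padded + GCM_BLOCK;

    if (out_cap < total) {
        return 0;
    }
    memset(out, 0, total);
    memcpy(out, iv, iv_len);

    /* Store the IV length in bits, big-endian, in the last 8 bytes. */
    uint64_t bits = (uint64_t)iv_len * 8u;
    for (int i = 0; i < 8; i++) {
        out[total - 1u - (size_t)i] = (uint8_t)(bits >> (8 * i));
    }
    return total;
}

/*
 * inc32: increment the last 4 bytes of a counter block as a big-endian
 * integer, wrapping modulo 2^32. The first 12 bytes are left untouched.
 */
void gcm_inc32(uint8_t block[GCM_BLOCK])
{
    for (int i = 15; i >= 12; i--) {
        if (++block[i] != 0) {
            break; /* no carry into the next byte */
        }
    }
}
```

    For an 8-byte IV this gives 32 bytes of GHASH input: the IV, 8 bytes of padding, 8 zero bytes and the length field with the value 64.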

    I am attaching code that performs arbitrary AES-GCM encryption/decryption using the CRACEN peripheral directly (the existing CRACEN PSA library was too messy to modify for my test). If you want to use it together with nRF Connect SDK, you should probably wrap CRACEN usage between `cracen_acquire` + `nrf_security_mutex_lock(cracen_mutex_symmetric)` and `cracen_release` + `nrf_security_mutex_unlock(cracen_mutex_symmetric)` and remove my ENABLE register writes.

    It would be nice to have code that uses this approach in the nRF Connect SDK, probably behind a Kconfig flag, because the GCM standard recommends that implementations restrict IV support to the 12-byte length by default, to promote interoperability.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <stdlib.h>
    #include <nrf54l15.h>
    
    #define DMATAG_BA411	      (1)
    #define DMATAG_CONFIG(offset) ((1 << 4) | (offset << 8))
    #define DMATAG_LAST		    (1 << 5)
    #define DMATAG_DATATYPE_HEADER	    (1 << 6)
    #define DMATAG_DATATYPE_REFERENCE   (3 << 6)
    #define DMATAG_INVALID_BYTES_MASK   0x1F
    #define DMATAG_INVALID_BYTES_OFFSET 8
    #define DMATAG_IGN(sz)		    ((sz) << DMATAG_INVALID_BYTES_OFFSET)
    
    /** Mode Register value for context loading */
    #define BA411_MODEID_CTX_LOAD (1u << 4)
    /** Mode Register value for context saving */
    #define BA411_MODEID_CTX_SAVE (1u << 5)
    
    #define BA411_MODEID_ECB	 0
    #define BA411_MODEID_CBC	 1
    #define BA411_MODEID_CTR	 2
    #define BA411_MODEID_GCM     6
    #define BA411_MODEID_XTS	 7
    #define BA411_MODEID_CHACH20 8
    
    /* BA411E-AES and BA419-SM4 Config register -> ModeOfOperation [16:8] */
    #define CMDMA_BA411_MODE_SET(modeid) (1 << (8 + (modeid)))
    
    /** BA411E-AES Config register -> KeySel[1:0] = [7:6], KeySel[4:2] = [30:28] */
    #define KEYREF_BA411E_HWKEY_CONF(index) ((((index)&0x3) << 6) | ((((index) >> 2) & 0x7) << 28))
    
    #define CM_CFG_DECRYPT 1
    #define CM_CFG_ENCRYPT 0
    
    #define DMA_LAST_DESCRIPTOR ((struct sxdesc *)1)
    #define DMA_REALIGN	    (1 << 29)
    #define DMA_DISCARD	    (1 << 30)
    
    /* Enable interrupts showing that an operation finished or aborted.
     * For that, we're interested in :
     *     - Fetcher DMA error (bit: 2)
     *     - Pusher DMA error (bit: 5)
     *     - Pusher DMA stop (bit: 4)
     *
     */
    #define CMDMA_INTMASK_EN ((1 << 2) | (1 << 5) | (1 << 4))
    
    #define ROUND_UP_16(v) (((v) + 15U) & ~15U)
    #define BITS_PER_BYTE 8
    
    /** A cryptomaster DMA descriptor */
    struct sxdesc {
        char *addr;
        struct sxdesc *next;
        uint32_t sz;
        uint32_t dmatag;
    };
    
    struct sx_ba411_cmdma_tags {
        uint32_t cfg;
        uint32_t iv_or_state;
        uint32_t key;
        uint32_t key2;
        uint32_t aad;
        uint32_t data;
    };
    
    static const struct sx_ba411_cmdma_tags ba411tags = {
        .cfg = DMATAG_BA411 | DMATAG_CONFIG(0),
        .iv_or_state = DMATAG_BA411 | DMATAG_CONFIG(0x28),
        .key = DMATAG_BA411 | DMATAG_CONFIG(0x08),
        .key2 = DMATAG_BA411 | DMATAG_CONFIG(0x48),
        .aad = DMATAG_BA411 | DMATAG_DATATYPE_HEADER,
        .data = DMATAG_BA411
    };
    
    static void turn_on() {
        NRF_CRACEN->ENABLE = CRACEN_ENABLE_CRYPTOMASTER_Msk;
        NRF_CRACENCORE->CRYPTMSTRDMA.INTEN = 0;
        NRF_CRACENCORE->CRYPTMSTRDMA.INTSTATCLR = ~0U;
        NRF_CRACENCORE->CRYPTMSTRDMA.INTEN = CMDMA_INTMASK_EN;
        NRF_CRACENCORE->CRYPTMSTRDMA.CONFIG = CRACENCORE_CRYPTMSTRDMA_CONFIG_FETCHCTRLINDIRECT_Msk | CRACENCORE_CRYPTMSTRDMA_CONFIG_PUSHCTRLINDIRECT_Msk;
    }
    
    static void turn_off() {
        NRF_CRACEN->ENABLE = 0;
    }
    
    static void run(const struct sxdesc *in, const struct sxdesc *out) {
        asm volatile ("" ::: "memory");
        NRF_CRACENCORE->CRYPTMSTRDMA.FETCHADDRLSB = (uintptr_t)in;
        NRF_CRACENCORE->CRYPTMSTRDMA.PUSHADDRLSB = (uintptr_t)out;
        NRF_CRACEN->EVENTS_CRYPTOMASTER = 0;
    
        NRF_CRACENCORE->CRYPTMSTRDMA.START = CRACENCORE_CRYPTMSTRDMA_START_STARTFETCH_Msk | CRACENCORE_CRYPTMSTRDMA_START_STARTPUSH_Msk;
        while (!NRF_CRACEN->EVENTS_CRYPTOMASTER) {
        }
    
        if (NRF_CRACENCORE->CRYPTMSTRDMA.INTSTATRAW != 0x12 || NRF_CRACENCORE->CRYPTMSTRDMA.STATUS != 0) {
            abort();
        }
        NRF_CRACENCORE->CRYPTMSTRDMA.INTSTATCLR = ~0U;
    }
    
    static void load_mask(uint32_t mask) {
        struct sxdesc in[1] = {
            {
                .addr = (char *)&mask,
                .next = DMA_LAST_DESCRIPTOR,
                .sz = sizeof(mask) | DMA_REALIGN,
                .dmatag = DMATAG_BA411 | DMATAG_CONFIG(0x68) | DMATAG_LAST
            }
        };
        struct sxdesc out[1] = {
            {
                .addr = NULL,
                .next = DMA_LAST_DESCRIPTOR,
                .sz = DMA_REALIGN,
                .dmatag = DMATAG_LAST
            }
        };
        run(in, out);
    }
    
    static void aes_gcm_crypt_12_byte_iv(uint32_t random_mask, bool is_decrypt, const uint8_t *key, size_t key_len, const uint8_t iv[12], const uint8_t *aad, size_t aad_len, const uint8_t *data_in, size_t data_len, uint8_t *data_out, uint8_t tag[16]) {
        uint32_t cfg = CMDMA_BA411_MODE_SET(BA411_MODEID_GCM) |
            KEYREF_BA411E_HWKEY_CONF(0) | // 1=MKEK, 2=MEXT
            is_decrypt // direction (false = encryption, true = decryption)
            ;
    
        uint64_t lengths[2] = {
            __builtin_bswap64(aad_len * 8),
            __builtin_bswap64(data_len * 8)
        };
    
        struct sxdesc in[6] = {
            {
                .addr = (char *)&cfg,
                .next = &in[1],
                .sz = sizeof(cfg) | DMA_REALIGN,
                .dmatag = ba411tags.cfg
            },
            {
                .addr = (char *)key,
                .next = &in[2],
                .sz = key_len | DMA_REALIGN,
                .dmatag = ba411tags.key
            },
            {
                .addr = (char *)iv,
                .next = &in[3],
                .sz = 12 | DMA_REALIGN,
                .dmatag = ba411tags.iv_or_state
            },
            {
                .addr = (char *)aad,
                .next = &in[4],
                .sz = ROUND_UP_16(aad_len) | DMA_REALIGN,
                .dmatag = ba411tags.aad | DMATAG_IGN(ROUND_UP_16(aad_len) - aad_len)
            },
            {
                .addr = (char *)data_in,
                .next = &in[5],
                .sz = ROUND_UP_16(data_len) | DMA_REALIGN,
                .dmatag = ba411tags.data | DMATAG_IGN(ROUND_UP_16(data_len) - data_len)
            },
            {
                .addr = (char *)lengths,
                .next = DMA_LAST_DESCRIPTOR,
                .sz = sizeof(lengths) | DMA_REALIGN,
                .dmatag = ba411tags.data | DMATAG_LAST
            }
        };
    
        struct sxdesc out[4] = {
            {
                .addr = NULL,
                .next = &out[1],
                .sz = ROUND_UP_16(aad_len) | DMA_DISCARD,
                .dmatag = 0
            },
            {
                .addr = (char *)data_out,
                .next = &out[2],
                .sz = data_len,
                .dmatag = 0
            },
            {
                .addr = NULL,
                .next = &out[3],
                .sz = ROUND_UP_16(data_len) - data_len | DMA_DISCARD,
                .dmatag = 0
            },
            {
                .addr = (char *)tag,
                .next = DMA_LAST_DESCRIPTOR,
                .sz = 16,
                .dmatag = 0
            }
        };
    
        turn_on();
        load_mask(random_mask);
        run(in, out);
        turn_off();
    }
    
    /**
     * Performs AES-GCM encryption or decryption.
     *
     * No CRACEN mutex is taken, so this code assumes that CRACEN is available to use throughout the call.
     *
     * @param random_mask A random 32-bit value used for attack countermeasures, should be unique per call
     * @param is_decrypt true if decrypt, false if encrypt
     * @param key The AES key to use
     * @param key_len The length of the AES key in bytes, must be 16 or 32
     * @param iv The IV to use for this operation
     * @param iv_len The length of IV in bytes, must be at least 1
     * @param aad The additional authenticated data
     * @param aad_len The number of bytes in the additional authenticated data
     * @param data_in The plaintext if encryption and ciphertext if decryption
     * @param data_len Length of the plaintext/ciphertext data in bytes
     * @param data_out The ciphertext if encryption and plaintext if decryption
     * @param tag Where to place the 16 byte tag output. For decryption, the caller must compare this tag with the supplied tag.
     */
    void aes_gcm_crypt(uint32_t random_mask, bool is_decrypt, const uint8_t *key, size_t key_len, const uint8_t *iv, size_t iv_len, const uint8_t *aad, size_t aad_len, const uint8_t *data_in, size_t data_len, uint8_t *data_out, uint8_t tag[16]) {
        if (iv_len == 12) {
            aes_gcm_crypt_12_byte_iv(random_mask, is_decrypt, key, key_len, iv, aad, aad_len, data_in, data_len, data_out, tag);
            return;
        }
    
        turn_on();
        load_mask(random_mask);
    
        uint32_t key_conf = KEYREF_BA411E_HWKEY_CONF(0); // 1=MKEK, 2=MEXT
    
        uint32_t cfg;
    
        struct sxdesc in_base[2] = {
            {
                .addr = (char *)&cfg,
                .next = &in_base[1],
                .sz = sizeof(uint32_t) | DMA_REALIGN,
                .dmatag = ba411tags.cfg
            },
            {
                .addr = (char *)key,
                .next = NULL,
                .sz = key_len | DMA_REALIGN,
                .dmatag = ba411tags.key
            }
        };
    
        /*
         * First step: compute J0 by using the GHASH functionality of CRACEN.
         */
        uint32_t counter0[4];
        {
            cfg = CMDMA_BA411_MODE_SET(BA411_MODEID_GCM) | key_conf | CM_CFG_DECRYPT | BA411_MODEID_CTX_SAVE;
    
            uint64_t iv_len_bits_big_endian[2] = {0, __builtin_bswap64(iv_len * BITS_PER_BYTE)};
            struct sxdesc in[2] = {
                {
                    .addr = (char *)iv,
                    .next = &in[1],
                    .sz = ROUND_UP_16(iv_len) | DMA_REALIGN,
                    .dmatag = ba411tags.aad | DMATAG_IGN(ROUND_UP_16(iv_len) - iv_len)
                },
                {
                    .addr = (char *)iv_len_bits_big_endian,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = 16 | DMA_REALIGN,
                    .dmatag = ba411tags.aad | DMATAG_LAST
                }
            };
            /*
             * The output is laid out as follows:
             * 1. Copy of the input to the GHASH function (discarded).
             * 2. Dummy IV output (discarded).
             * 3. The output from the GHASH function.
             */
            struct sxdesc out[2] = {
                {
                    .addr = NULL,
                    .next = &out[1],
                    .sz = ROUND_UP_16(iv_len) + 16 + 16 | DMA_DISCARD,
                    .dmatag = 0
                },
                {
                    .addr = (char *)counter0,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = 16,
                    .dmatag = 0
                }
            };
            in_base[1].next = in;
            run(in_base, out);
        }
    
        /*
         * Second step: process all AAD, data, and the final len(A) || len(C) block.
         */
        {
            uint32_t counter0_saved_lsw = counter0[3];
            counter0[3] = __builtin_bswap32(__builtin_bswap32(counter0[3]) + 1);
    
            cfg = CMDMA_BA411_MODE_SET(BA411_MODEID_GCM) | key_conf | is_decrypt | BA411_MODEID_CTX_LOAD | BA411_MODEID_CTX_SAVE;
    
            uint64_t lengths[2] = {
                __builtin_bswap64(aad_len * BITS_PER_BYTE),
                __builtin_bswap64(data_len * BITS_PER_BYTE)
            };
    
            /*
             * The input is laid out as follows:
             * 1. "state" contains 16 byte current IV followed by 16 byte current GHASH value.
             *    The GHASH value is not loaded and will hence be left zero-initialized.
             * 2. AAD, processed only with GHASH.
             * 3. Data, processed by both AES-CTR and GHASH.
             * 4. The final len(A) || len(C) block, processed only with GHASH.
             */
            struct sxdesc in[4] = {
                {
                    .addr = (char *)counter0,
                    .next = &in[1],
                    .sz = 16 | DMA_REALIGN,
                    .dmatag = ba411tags.iv_or_state
                },
                {
                    .addr = (char *)aad,
                    .next = &in[2],
                    .sz = ROUND_UP_16(aad_len) | DMA_REALIGN,
                    .dmatag = ba411tags.aad | DMATAG_IGN(ROUND_UP_16(aad_len) - aad_len)
                },
                {
                    .addr = (char *)data_in,
                    .next = &in[3],
                    .sz = ROUND_UP_16(data_len) | DMA_REALIGN,
                    .dmatag = ba411tags.data | DMATAG_IGN(ROUND_UP_16(data_len) - data_len)
                },
                {
                    .addr = (char *)lengths,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = sizeof(lengths) | DMA_REALIGN,
                    .dmatag = ba411tags.aad | DMATAG_LAST
                }
            };
            /**
             * The output is laid out as follows:
             * 1. Copy of the AAD input to the GHASH function, including padding (discarded).
             * 2. Encrypted/decrypted data.
             * 3. Final padding of encrypted/decrypted data (discarded).
             * 4. Copy of the len(A) || len(C) block (discarded).
             * 5. IV output at this state (discarded).
             * 6. GHASH output at this state.
             */
            struct sxdesc out[4] = {
                {
                    .addr = NULL,
                    .next = &out[1],
                    .sz = ROUND_UP_16(aad_len) | DMA_DISCARD,
                    .dmatag = 0,
                },
                {
                    .addr = (char *)data_out,
                    .next = &out[2],
                    .sz = data_len,
                    .dmatag = 0
                },
                {
                    .addr = NULL,
                    .next = &out[3],
                    .sz = ROUND_UP_16(data_len) - data_len + 16 + 16 | DMA_DISCARD,
                    .dmatag = 0
                },
                {
                    .addr = (char *)tag,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = 16,
                    .dmatag = 0
                }
            };
            in_base[1].next = in;
            run(in_base, out);
    
            counter0[3] = counter0_saved_lsw;
        }
    
        /*
         * Final step: Use a normal AES-CTR operation to encrypt the tag using J0 as counter block.
         */
        {
            cfg = CMDMA_BA411_MODE_SET(BA411_MODEID_CTR) | key_conf;
    
            struct sxdesc in[2] = {
                {
                    .addr = (char *)counter0,
                    .next = &in[1],
                    .sz = 16 | DMA_REALIGN,
                    .dmatag = ba411tags.iv_or_state
                },
                {
                    .addr = (char *)tag,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = 16 | DMA_REALIGN,
                    .dmatag = ba411tags.data | DMATAG_LAST
                }
            };
            struct sxdesc out[1] = {
                {
                    .addr = (char *)tag,
                    .next = DMA_LAST_DESCRIPTOR,
                    .sz = 16,
                    .dmatag = 0
                }
            };
            in_base[1].next = in;
            run(in_base, out);
        }
    
        turn_off();
    }
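
    Since `aes_gcm_crypt` only outputs the computed tag, a decrypting caller still has to check it against the received one, preferably without an early exit that leaks how many bytes matched. A minimal sketch of such a check follows; `tag_matches` is a hypothetical helper, not part of the code above or of the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compares two tags without returning early on the first mismatch.
 * Returns 1 if all tag_len bytes are equal, 0 otherwise.
 */
int tag_matches(const uint8_t *computed, const uint8_t *received, size_t tag_len)
{
    assert(computed != NULL && received != NULL);

    volatile uint8_t diff = 0;

    for (size_t i = 0; i < tag_len; i++) {
        diff |= (uint8_t)(computed[i] ^ received[i]);
    }
    return diff == 0;
}
```

    On decryption, the caller would run `aes_gcm_crypt` with `is_decrypt` set to true and use the contents of `data_out` only if `tag_matches(tag, received_tag, 16)` returns 1.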
    
