speed up flash programming time

Hi,

We are using J-Link and nrfjprog to program the nRF52832 flash via the SWD interface. Our combined hex file is about 512k, and it takes about 12 seconds to flash.

I tried changing the clock speed, but it had no noticeable effect. I also searched the forum and could not find a clear solution for this.

Are there any suggestions that can significantly reduce the programming time? Our production volume is quite large (>100k per quarter), so it would save a lot if the time could be reduced.

  • I had a similar problem and solved it by mass-updating devices through the native RADIO peripheral.

    The approach is that a preprogrammed firmware (called the updater) periodically wakes up and listens on a channel for update packets in the air. If there are any, it starts capturing them and writing them to flash, keeping track of which offset+length ranges it has received. Once it has received all ranges, it resets the entry point to the starting address and reboots. (The wake-up and listening interval is adjusted in the updater depending on whether the device has an on/off switch, or whether it ships with a battery and is already on when it leaves the assembly house, etc.)
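
    The bookkeeping of received ranges can be sketched as a bitmap with one bit per chunk. This is only a minimal sketch, assuming fixed-size, chunk-aligned packets; CHUNK_SIZE, MAX_FW_SIZE, and the tracker_* names are hypothetical and not part of any Nordic SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define CHUNK_SIZE   128u                  /* payload bytes per update packet */
#define MAX_FW_SIZE  (512u * 1024u)        /* largest image the tracker covers */
#define MAX_CHUNKS   (MAX_FW_SIZE / CHUNK_SIZE)

typedef struct {
    uint8_t  bitmap[MAX_CHUNKS / 8u];      /* one bit per chunk, 1 = received */
    uint32_t total_chunks;                 /* chunks needed for the whole image */
    uint32_t received_chunks;              /* distinct chunks seen so far */
} chunk_tracker_t;

/* Prepare the tracker for an image of fw_size bytes (rounded up to chunks). */
void tracker_init(chunk_tracker_t *t, uint32_t fw_size)
{
    assert(fw_size > 0u && fw_size <= MAX_FW_SIZE);
    memset(t, 0, sizeof(*t));
    t->total_chunks = (fw_size + CHUNK_SIZE - 1u) / CHUNK_SIZE;
}

/* Record the chunk that starts at byte offset `offset`.
 * Returns true if the chunk is new (and should be written to flash),
 * false if it is a duplicate, misaligned, or out of range. */
bool tracker_mark(chunk_tracker_t *t, uint32_t offset)
{
    if (offset % CHUNK_SIZE != 0u) {
        return false;
    }
    uint32_t idx = offset / CHUNK_SIZE;
    if (idx >= t->total_chunks) {
        return false;
    }
    uint8_t mask = (uint8_t)(1u << (idx % 8u));
    if (t->bitmap[idx / 8u] & mask) {
        return false;                      /* already received in an earlier round */
    }
    t->bitmap[idx / 8u] |= mask;
    t->received_chunks++;
    return true;
}

/* True once every chunk of the image has been seen at least once. */
bool tracker_complete(const chunk_tracker_t *t)
{
    return t->received_chunks == t->total_chunks;
}
```

    The updater would call tracker_mark() for every packet it receives, write the payload to flash only when it returns true, and verify the image once tracker_complete() reports true.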

    Another device, located at your facility, holds the actual target firmware plus an additional firmware (called the distributor) that constantly loops over the target firmware's flash areas (in its own flash), assembles the update packets, and broadcasts them. A packet contains an offset, the chunk size, and the payload (e.g. in 128-byte chunks). You will also have to transmit the overall firmware size and ideally an overall hash (e.g. crc32) as a start or end packet.
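
    As an illustration of the packet contents described above (offset, chunk size, payload), here is a minimal sketch of how the distributor could assemble one packet and step through the image. The byte layout (4-byte little-endian offset, 1-byte length, then payload) and the names packet_build and packet_next_offset are assumptions made for this example, not an existing format or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE       128u  /* hypothetical payload size, as in the 128-byte example */
#define PKT_HEADER_SIZE  5u    /* assumed header: 4-byte offset + 1-byte length */
#define PKT_MAX_SIZE     (PKT_HEADER_SIZE + CHUNK_SIZE)

/* Build the packet for the chunk that starts at `offset` inside the image
 * `fw` of `fw_size` bytes. `out` must hold at least PKT_MAX_SIZE bytes.
 * Returns the number of bytes to hand to the radio. */
size_t packet_build(const uint8_t *fw, uint32_t fw_size, uint32_t offset,
                    uint8_t *out)
{
    assert(offset < fw_size);

    /* The last chunk of the image may be shorter than CHUNK_SIZE. */
    uint32_t remaining = fw_size - offset;
    uint8_t len = (uint8_t)(remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE);

    /* Serialize the offset byte by byte so the layout does not depend on
     * struct padding or on the endianness of either device. */
    out[0] = (uint8_t)(offset);
    out[1] = (uint8_t)(offset >> 8);
    out[2] = (uint8_t)(offset >> 16);
    out[3] = (uint8_t)(offset >> 24);
    out[4] = len;
    memcpy(&out[PKT_HEADER_SIZE], &fw[offset], len);

    return PKT_HEADER_SIZE + len;
}

/* Offset of the chunk after `offset`, wrapping to 0 at the end of the image
 * so that the distributor keeps looping over the firmware. */
uint32_t packet_next_offset(uint32_t fw_size, uint32_t offset)
{
    uint32_t next = offset + CHUNK_SIZE;
    return (next >= fw_size) ? 0u : next;
}
```

    The start or end packet carrying the overall size and hash would need its own marker (for example a reserved offset value); that part is left out here.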

    Ideally, you implement the transmission so that the updaters send no ACK packets, to avoid collisions on the channels, and you add some security measures to minimize the risk of unintended updates (e.g. you can place a public key in the updater and sign all packets in the distributor with the matching private key).

    Not sending ACK responses is not an issue: the distributor continuously loops through its firmware chunks, so if the updater misses one, it will likely catch it in the next round.
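
    To get a feel for how many loop rounds are needed without ACKs, here is a rough estimate in C. It assumes (and this is an assumption, not a measurement) that every packet is lost independently with the same probability; the function names are made up for illustration.

```c
#include <assert.h>

/* Probability that one particular chunk is still missing after `rounds`
 * full distributor loops, if each broadcast of it is lost independently
 * with probability `loss` (0.0 .. 1.0). */
double chunk_missing_prob(double loss, unsigned rounds)
{
    assert(loss >= 0.0 && loss <= 1.0);
    double p = 1.0;
    for (unsigned i = 0; i < rounds; i++) {
        p *= loss;                 /* the chunk must be lost in every round */
    }
    return p;
}

/* Probability that an updater has received all `chunks` chunks after
 * `rounds` full distributor loops, under the same independence assumption. */
double image_complete_prob(double loss, unsigned rounds, unsigned chunks)
{
    double chunk_ok = 1.0 - chunk_missing_prob(loss, rounds);
    double p = 1.0;
    for (unsigned i = 0; i < chunks; i++) {
        p *= chunk_ok;             /* every chunk must have arrived at least once */
    }
    return p;
}
```

    For example, with 4096 chunks (512 KiB in 128-byte chunks) and 10% packet loss, image_complete_prob(0.1, 5, 4096) comes out at roughly 0.96, so under these assumptions about 96% of the updaters would be complete after five rounds. Real radio loss tends to come in bursts, so treat this as an optimistic ballpark.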

    The updater's/distributor's linker script could use a layout like this (below is an example for the nRF52840):

    MEMORY
    {
        FLASH (rx) : ORIGIN = 0x000fc000, LENGTH = 0x02000
        RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
    }

    For the nRF52832 it would look like this:

    MEMORY
    {
        FLASH (rx) : ORIGIN = 0x0007c000, LENGTH = 0x02000
        RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
    }

    In your case, I would have one distributor (on one end of the channels) for the test firmware, another few for receiving test results along with the device id (on different channels), and a third one on the other end of the channels that distributes the final firmware.

    The test firmware would send its test results on a random channel selected from a predefined list (the channels the test listeners watch) and would wait for an ACK. Once the ACK arrives, it goes ahead with updating to the final firmware.

    With this approach (at least regarding the update procedure) you are theoretically limited by the number of devices that can physically fit in the area you are broadcasting your firmware in without shadowing the signal too much, and by the number of collisions that the test-result transmissions and their ACKs cause (which is easier to scale).

    Some additional thoughts:
    - Besides the overall hash of the firmware that you send at the beginning or end of a distributor loop cycle, it makes sense to use 3-byte CRCs per packet (using the NRF_RADIO registers) to further lower the risk of corruption in the transmitted firmware. If the updater detects corruption using the final hash, it should just re-receive all update packets from the beginning.
    - If you are worried about transmitting the firmware OTA in your local area, you can pseudo-encrypt it using the public key that is already in the updater (along with some other logic), or for better security, you can do an initial handshake with an additional device for a fixed key exchange (that can also be regenerated periodically) and use AES-128 or similar.
    - It might sound funny, but if the key exchange and test-result transmissions become a significant bottleneck, you can just put all devices on a moving belt (or cart for that matter) and move them from one place to another, where there are multiple distributors/key-exchangers/test-result-grabbers along the way.
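
    The overall-hash check mentioned in the first point can be sketched in plain C as follows. This is a bitwise CRC-32 (the common reflected 0xEDB88320 variant), written out so the example is self-contained; fw_crc32 and fw_image_valid are hypothetical names. In a real project you would more likely use the CRC routine that ships with your SDK, as long as the distributor and the updater use the same one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected polynomial 0xEDB88320, initial value
 * 0xFFFFFFFF, final XOR 0xFFFFFFFF. Slow but table-free, which keeps
 * the updater small. */
uint32_t fw_crc32(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0u);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the dropped bit was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Compare the received image against the hash announced by the distributor.
 * On a mismatch the updater would discard its progress and receive again. */
bool fw_image_valid(const uint8_t *image, size_t len, uint32_t expected_crc)
{
    return fw_crc32(image, len) == expected_crc;
}
```

    The per-packet 3-byte CRC is a separate mechanism handled by the radio hardware; the function above only covers the check over the complete image.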

    Also, a 512k hex file is not that big. For the number of bytes actually transferred, look at the size of the bin file instead (that is what actually lives in flash).
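
    To make the hex-versus-bin difference concrete: in an Intel HEX file, a typical 16-byte data record is written as 43 text characters (":" + 2 for the length + 4 for the address + 2 for the type + 32 for the data + 2 for the checksum) plus the line ending, so the text is roughly 2.7 to 2.8 times larger than the data it carries. Assuming 16-byte records, a 512k hex file would hold somewhere around 180 to 190 KB of actual flash content. The sketch below counts the data bytes in a hex text; ihex_payload_bytes is a hypothetical helper, and it does not verify checksums.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Sum the data bytes of all type-00 (data) records in an Intel HEX text.
 * Returns the byte count, or -1 if a record header is malformed or
 * truncated. Record checksums are not verified. */
long ihex_payload_bytes(const char *text)
{
    assert(text != NULL);
    long total = 0;

    for (const char *p = text; *p != '\0'; p++) {
        if (*p != ':') {
            continue;              /* skip data digits and line endings */
        }
        /* Record header is ":LLAAAATT": byte count, address, record type. */
        int d[8];
        for (int i = 0; i < 8; i++) {
            d[i] = hex_digit(p[1 + i]);
            if (d[i] < 0) {
                return -1;         /* also catches the end of the string */
            }
        }
        int count = d[0] * 16 + d[1];
        int type  = d[6] * 16 + d[7];
        if (type == 0x00) {
            total += count;        /* only data records end up in flash */
        }
        p += 8;                    /* continue after the header */
    }
    return total;
}
```

    Note that this counts bytes actually present in the file; a bin file generated from the same hex can be larger if the image has gaps that get padded.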

    Below is an excerpt of code that copies the flash content from one area to another (in the aforementioned approach you don't have to copy from A to B, but it can be a good starting point):

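    storage_init();

    if (storage.firmware_ready_to_copy) {
        /* Verify the staged image in bank 1 before touching bank 0. */
        uint32_t crc = crc32_compute((uint8_t *) BANK_1_START, storage.firmware_size, 0);

        if (crc == storage.firmware_crc) {
            /* Number of flash pages the image occupies, rounded up. */
            uint32_t i, pages = (storage.firmware_size + (CODE_PAGE_SIZE - 1)) / CODE_PAGE_SIZE;

            /* Erase and rewrite bank 0 page by page, and repeat the whole copy
               until the CRC of bank 0 matches (note: no retry limit here). */
            do {
                for (i = 0; i < pages; i++) {
                    nrf_gpio_pin_clear(PIN_LED02);
                    nrf_nvmc_page_erase((uint32_t) (BANK_0_START + i * CODE_PAGE_SIZE));
                    nrf_gpio_pin_set(PIN_LED02);
                    nrf_nvmc_write_words(BANK_0_START + i * CODE_PAGE_SIZE, (const uint32_t *) (BANK_1_START + i * CODE_PAGE_SIZE), CODE_PAGE_SIZE / sizeof(uint32_t));
                }
            } while (crc32_compute((uint8_t *) BANK_0_START, storage.firmware_size, 0) != crc);
        }

        /* Clear the flag in both cases, so a corrupt staged image is not retried on every boot. */
        storage.firmware_ready_to_copy = 0;

        storage_save();
    }

    /* Reset the device. */
    NVIC_SystemReset();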
    

    You can start an app from a given address on the bus with cpu_start():

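    /* Loads the initial stack pointer and the reset handler from the vector table at addr
       and jumps to the app. If called from an interrupt handler (IPSR != 0), it builds an
       exception frame and does an exception return, so the app starts in thread mode. */
    static void __attribute__ ((noinline)) start_app(uint32_t addr) {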
        __asm volatile(
        "ldr   r0, [%0]\n"
        "msr   msp, r0\n"
        "ldr   r0, [%0, #0x04]\n"
        "movs  r4, #0xFF\n"
        "sxtb  r4, r4\n"
        "mrs   r5, IPSR\n"
        "cmp   r5, #0x00\n"
        "bne   isr_abort\n"
        "mov   lr, r4\n"
        "bx    r0\n"
    
        "isr_abort:  \n"
        "mov   r5, r4\n"
        "mov   r6, r0\n"
        "movs  r7, #0x21\n"
        "rev   r7, r7\n"
        "push  {r4-r7}\n"
        "movs  r4, #0x00\n"
        "movs  r5, #0x00\n"
        "movs  r6, #0x00\n"
        "movs  r7, #0x00\n"
        "push  {r4-r7}\n"
        "movs  r0, #0xF9\n"
        "sxtb  r0, r0\n"
        "bx    r0\n"
        ".align\n"
        ::
        "r"(addr):
        "r0", "r4", "r5", "r6", "r7");
    }
    
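    void cpu_start(uint32_t address) {
        /* Disable all NVIC interrupts so none fires while handing over to the app. */
        NVIC->ICER[0] = 0xFFFFFFFF;
    #if defined(__NRF_NVIC_ISER_COUNT) && __NRF_NVIC_ISER_COUNT == 2
        NVIC->ICER[1] = 0xFFFFFFFF;
    #endif

        /* address must point to the app's vector table (initial SP, then reset handler). */
        start_app(address);
    }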
    

    Hope that helps,

    Kornél
