nRF52832 SDK15.3: Flash erased at the end addresses, application runs but bootloader cannot be entered

Hello,

We have a problem that affects a few devices (< 0.1 %) in mass production. The devices are flashed, boot up, pass tests, and are shipped to customers. Some customers later report that the bootloader does not work. These customers are generally familiar with the devices and can DFU other identical devices they own, so we do not think this is user error.

On warranty returns we can reproduce the problem. When we read back the flash, the final addresses (0x6XXXX and up) are filled with 0xFF. The 0xFF region does not start at a flash page boundary, but it is possible that the application was written completely and the programming connector was disconnected before the bootloader was written, partially or completely.
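
A minimal sketch of how a readback could be checked for where the erased tail begins and whether that point is page-aligned. It assumes the readback has already been converted to a flat binary buffer; the function names are hypothetical, and the 4096-byte page size is the nRF52832 flash page size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 4096u /* nRF52832 flash page size in bytes */

/* Returns the offset of the first byte of the trailing run of 0xFF bytes,
 * or len if the image does not end in 0xFF. */
size_t erased_tail_start(const uint8_t *image, size_t len)
{
    size_t start = len;
    while (start > 0 && image[start - 1] == 0xFFu) {
        start--;
    }
    return start;
}

/* True when the erased tail begins exactly on a flash page boundary.
 * base_addr is the flash address that image[0] was read from. */
bool tail_is_page_aligned(uint32_t base_addr, size_t tail_start)
{
    return ((base_addr + (uint32_t)tail_start) % FLASH_PAGE_SIZE) == 0u;
}
```

A page-aligned boundary would point towards a page erase, while an unaligned one fits an interrupted or skipped write better. Note that firmware which genuinely ends in 0xFF bytes moves the detected boundary earlier than the real end of written data.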

We do not keep read-back flash images from devices after production. Our programming script runs a verify step, but it is possible that the operator does not notice a verification error and passes the device onwards. It is also possible that the flash gets corrupted somehow, or that the MBR/UICR locations containing the bootloader start address are not written, or are overwritten later.
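
A rough sketch of how the two bootloader address words could be sanity-checked once they have been read from a returned device (for example with a debugger memory read). The locations in the comments (0x00000FF8 in the MBR page and 0x10001014, i.e. NRF_UICR->NRFFW[0]) are what we understand from nrf_mbr.h and should be confirmed against the SDK in use. The function names and the 0x00078000 value used below are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed locations of the bootloader start address (verify in nrf_mbr.h):
 *   0x00000FF8  word in the MBR page
 *   0x10001014  NRF_UICR->NRFFW[0] */
#define MBR_END_ADDR    0x00001000u /* first address above the MBR */
#define FLASH_END_ADDR  0x00080000u /* nRF52832: 512 kB of flash */
#define FLASH_PAGE_SIZE 4096u
#define ERASED_WORD     0xFFFFFFFFu

typedef enum { BL_ADDR_OK, BL_ADDR_ERASED, BL_ADDR_INVALID } bl_addr_status_t;

/* Classifies one bootloader address word read back from the device. */
bl_addr_status_t classify_bootloader_addr(uint32_t word)
{
    if (word == ERASED_WORD) {
        return BL_ADDR_ERASED;
    }
    if (word < MBR_END_ADDR || word >= FLASH_END_ADDR ||
        (word % FLASH_PAGE_SIZE) != 0u) {
        return BL_ADDR_INVALID;
    }
    return BL_ADDR_OK;
}

/* True when at least one of the two locations holds a plausible address. */
bool bootloader_addr_present(uint32_t mbr_word, uint32_t uicr_word)
{
    return classify_bootloader_addr(mbr_word) == BL_ADDR_OK ||
           classify_bootloader_addr(uicr_word) == BL_ADDR_OK;
}
```

For example, `bootloader_addr_present(0xFFFFFFFFu, 0x00078000u)` is true, while two erased words give false. A plausible address only shows that the pointer was written; it does not show that bootloader code is actually present at that address.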

To summarize:

  • A few of our devices in the field have the last flash addresses erased, leaving the bootloader unusable while the application runs fine.
    • Devices are supposed to enter the bootloader on a button press. Working devices enter the bootloader when the button is pressed at boot; faulty devices enter the application instead. The button itself works in the application.
    • We have not tested entering the bootloader through the Buttonless DFU Service.
  • We suspect that this might be a problem with our flashing process: the programmer is disconnected before flashing is complete and the failed verification step is ignored by the operator.
  • Another possibility is that the firmware start addresses in UICR/MBR are wrong and the SoftDevice boots the application directly.
  • A third possibility is that the flash gets erased after flashing, but we do not know what the mechanism could be.

Do you have any insight into what might be the root cause of our problem and how to avoid it? So far we have considered adding a checksum check of the bootloader to the application, so that the bootloader checks the application CRC first and the application then checks the bootloader CRC.
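
A minimal sketch of the application-side check we have in mind, assuming a standard CRC-32 (reflected polynomial 0xEDB88320) over the bootloader region and an expected value stored at build time. The function names are hypothetical; on the device the SDK's own CRC32 library could probably be used in place of the bitwise loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3, reflected), small but slow. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the dropped bit was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* True when the region's CRC equals the value recorded at build time. */
bool region_matches_crc(const uint8_t *region, size_t len, uint32_t expected_crc)
{
    return crc32_ieee(region, len) == expected_crc;
}
```

On the device, `region` would point at the bootloader start address and `len` would be the bootloader image size; both would have to come from the build, as would the expected CRC. An application that finds a mismatch could then report it, for example during the production test, instead of shipping silently.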

  • Simple fix: Use the mergehex tool to merge the production hex files into a single file, then flash in a single step. It will even be a little faster in production.

    I strongly suspect that the operators noticed parts passing the test when the last flash steps were not yet completed, and thus skipped them to save time (and thus money).

    Random erase may be possible (e.g. brownout or software bug), but I would expect that to brick the devices instead of erasing the bootloader completely and with the main app still running just fine.

    Another indication would be FDS/Peer Manager data. If FDS cannot find where the bootloader is (or at least should be), it will put its data at the end of flash. One would then expect to find FDS/Peer Manager data in the very last flash pages.
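
    A rough sketch of how a readback could be scanned for FDS pages. It assumes the page tag layout I remember from fds_internal_defs.h (first word 0xDEADC0DE, second word 0xF11E01FE for a data page or 0xF11E01FF for the swap page), so please verify these values against your SDK. The function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed FDS page tag values; check fds_internal_defs.h in your SDK. */
#define FDS_PAGE_TAG_MAGIC 0xDEADC0DEu
#define FDS_PAGE_TAG_SWAP  0xF11E01FFu
#define FDS_PAGE_TAG_DATA  0xF11E01FEu

/* Reads a little-endian 32-bit word from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True when the page starts with an FDS data or swap page tag. */
bool page_has_fds_tag(const uint8_t *page)
{
    uint32_t word0 = read_le32(page);
    uint32_t word1 = read_le32(page + 4);
    return word0 == FDS_PAGE_TAG_MAGIC &&
           (word1 == FDS_PAGE_TAG_SWAP || word1 == FDS_PAGE_TAG_DATA);
}

/* Counts the pages in a flat readback image that carry an FDS page tag. */
size_t count_fds_pages(const uint8_t *image, size_t len, size_t page_size)
{
    size_t count = 0;
    if (page_size < 8) {
        return 0; /* a page must at least hold the two tag words */
    }
    for (size_t off = 0; off + page_size <= len; off += page_size) {
        if (page_has_fds_tag(image + off)) {
            count++;
        }
    }
    return count;
}
```

    FDS-tagged pages at the very top of flash, where the bootloader should be, would suggest that FDS did not see a bootloader address when it initialised.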

  • Thank you for your reply.

    We're already using mergehex and flashing a single hex file that contains the SoftDevice, application, bootloader and bootloader settings page.

    We will check whether there is FDS data at the end of flash, although I would be very curious about why it would affect just a small portion of devices.
