Hello,
We have a problem that affects a small fraction of devices (< 0.1 %) in mass production. The devices are flashed, boot, pass production tests, and are shipped to customers. Some customers later report that the bootloader does not work. These customers are generally familiar with the devices and can run DFU on other identical devices they own, so we do not think this is user error.
When a device comes back as a warranty return, we can reproduce the customer's problem. When we read back the flash, we see that the last addresses (0x6XXXX and above) are filled with 0xFF. The 0xFF region does not start at a flash page boundary. One possible explanation is that the application was written completely and the programming connector was then disconnected before the bootloader was written, or while it was only partially written.
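To make the read-back comparison repeatable, the start of the erased tail can be located programmatically. The following is only a minimal sketch that runs on a host against a read-back image held in memory. The function names and the 4 KiB page size are assumptions for illustration, not values taken from our setup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB flash pages. Adjust to the actual device. */
#define FLASH_PAGE_SIZE 4096u

/* Returns the offset of the first byte of the trailing run of 0xFF,
 * or len if the image does not end in 0xFF. */
size_t erased_tail_start(const uint8_t *image, size_t len)
{
    size_t start = len;
    while (start > 0 && image[start - 1] == 0xFFu) {
        start--;
    }
    return start;
}

/* True if the offset falls exactly on a flash page boundary. */
bool is_page_aligned(size_t offset)
{
    return (offset % FLASH_PAGE_SIZE) == 0;
}
```

Note that the returned offset is only a lower bound for where writing stopped, because the last bytes that were really programmed may themselves have been 0xFF padding.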
We do not keep read-back flash images from devices after production. Our programming script runs a verify step, but it is possible that an operator does not notice a verification error and passes the device on. It is also possible that the flash gets corrupted somehow, or that the MBR/UICR entry containing the bootloader start address is never written or is overwritten later.
To summarize:
- A few of our devices in the field have the last flash addresses erased, which leaves the bootloader unusable while the application runs fine.
- Devices are supposed to enter the bootloader when a button is pressed at boot. Working devices do so; faulty devices boot into the application instead. The button itself works in the application.
- We have not tested entering the bootloader through the Buttonless DFU Service.
- We suspect a problem in our flashing process: the programmer is disconnected before flashing is complete, and the operator ignores the failed verification step.
- Another possibility is that the firmware start addresses in UICR/MBR are wrong and the SoftDevice boots the application directly.
- A third possibility is that the flash gets erased after flashing, but we do not know what the mechanism could be.
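To tell the second possibility apart from the others, the stored bootloader start address could be read back from a returned device and sanity-checked. The sketch below only shows the check itself. Where the word is stored depends on the SDK version and is not shown here, and the function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decides whether a stored bootloader start address
 * looks valid. stored_addr is the 32-bit word read back from the device. */
bool bootloader_addr_plausible(uint32_t stored_addr,
                               uint32_t flash_size,
                               uint32_t page_size)
{
    if (stored_addr == 0xFFFFFFFFu) {
        return false;               /* erased word: address never written */
    }
    if (page_size == 0u || (stored_addr % page_size) != 0u) {
        return false;               /* must start on a page boundary */
    }
    if (stored_addr == 0u || stored_addr >= flash_size) {
        return false;               /* must point inside flash, above 0 */
    }
    return true;
}
```

If the stored address is plausible but the flash behind it reads 0xFF, the address entry is probably not the culprit and the bootloader image itself is missing.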
Do you have any insight into what the root cause might be and how to avoid it? So far we have considered adding a checksum check of the bootloader to the application, so that the bootloader checks the application CRC first and the application then checks the bootloader CRC.
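The application-side check we have in mind could look roughly like the sketch below. It uses a plain bitwise CRC-32 (reflected polynomial 0xEDB88320) rather than any SDK routine. The function names are hypothetical, and how the bootloader region and the expected CRC are provided to the application is still an open question for us.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3, reflected). Slow but table-free. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}

/* Hypothetical check run by the application at startup.
 * start/len describe the bootloader region; expected_crc is the value
 * recorded when the bootloader image was built. */
bool bootloader_intact(const uint8_t *start, size_t len, uint32_t expected_crc)
{
    return crc32_ieee(start, len) == expected_crc;
}
```

On a device, the application would call bootloader_intact() with a pointer to the bootloader start address in flash and report a failure, for example through the production test. The expected CRC would also need updating if the bootloader itself is ever replaced through DFU.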

