FOTA over BLE fails after upgrade to NCS 2.8.0

Hi,

I'm trying to implement firmware updates over BLE with mcuboot as boot loader.

I got it working with NCS 2.6.1, but I've upgraded to NCS 2.8.0 and now nRF Connect for iOS gives the error: Sending the request timed out.

I have the following configuration (of which I think it's important to the issue):

sysbuild.conf:

SB_CONFIG_BOOTLOADER_MCUBOOT=y
SB_CONFIG_BOOT_SIGNATURE_TYPE_ECDSA_P256=y
SB_CONFIG_BOOT_SIGNATURE_KEY_FILE="${APP_DIR}/priv.pem"

SB_CONFIG_PARTITION_MANAGER=y

SB_CONFIG_DFU_ZIP=y
SB_CONFIG_DFU_ZIP_APP=y

prj.conf:

CONFIG_BT_SMP=y

# FOTA configuration
# Configuration is basically the expansion of:
# CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU=y
# CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU_SPEEDUP=y
CONFIG_MCUMGR=y
CONFIG_IMG_ERASE_PROGRESSIVELY=y
CONFIG_NET_BUF=y
CONFIG_ZCBOR=y
CONFIG_CRC=y
CONFIG_MCUMGR_TRANSPORT_BT=y
CONFIG_MCUMGR_TRANSPORT_BT_CONN_PARAM_CONTROL=y
CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT=y
CONFIG_IMG_MANAGER=y
CONFIG_STREAM_FLASH=y
CONFIG_FLASH_MAP=y
CONFIG_FLASH=y
CONFIG_MCUMGR_GRP_IMG=y
CONFIG_MCUMGR_GRP_OS=y
CONFIG_MCUMGR_GRP_OS_BOOTLOADER_INFO=y
CONFIG_MCUMGR_GRP_OS_MCUMGR_PARAMS=y
CONFIG_MCUMGR_TRANSPORT_BT_REASSEMBLY=y
CONFIG_BT_L2CAP_TX_MTU=498
CONFIG_BT_BUF_ACL_RX_SIZE=502
CONFIG_BT_BUF_ACL_TX_SIZE=251
CONFIG_BT_CTLR_DATA_LENGTH_MAX=251

Apart from shuffling a few config's from prj.conf to sysbuild.conf, the only change is that I needed to add 'CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT' to allow access to the SMP characteristics as CONFIG_BT_SMP is enabled but I do not have MITM protection.

I can use nRFConnect for iOS to list the images, get bootloader info and reboot the device, so it seems that the basic SMP commands are working. Why does it error-out while trying to perform the actual update? Did I miss something while migrating to NCS 2.8.0?

Thanks in advance.

Kind regards,

Remco Poelstra

Parents
  • Hi,

    Which version of the nRF Connect for iOS application do you use?

    Have you tried to use Device Manager mobile application instead? Do you see any difference in result compared to nRF Connect for iOS?
    You can look at this testing steps when performing FOTA update on the nrf52 device.

    Apart from shuffling a few config's from prj.conf to sysbuild.conf, the only change is that I needed to add 'CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT' to allow access to the SMP characteristics as CONFIG_BT_SMP is enabled but I do not have MITM protection.

    What happens if you do not add CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT? Do you get the same error?

    Best regards,
    Dejan

  • Hi,

    I'm using nRF Connect for iOS version 2.7.13(40).

    I just tried Device Manager (1.8.1(3)) and it seems to have the same problem. It gets stuck on UPLOADING... after I press 'Start'. Other functions seem to work fine.

    I I remove CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT I can't use the SMP functions at all. The device disconnects and reports an error (over RTT) that the requested authentication level can't be reached (which is an understandable error in my situation).

    During the last tests I had some logging intensive task running and I noticed that during the upload the device (also) logs 'mcuboot_util: Image index: 0, Swap type: none' and keeps logging the other task for a short while and then completely stops. It seems the processor completely hangs. There is no other output and running the firmware in the debugger also doesn't trigger a breakpoint or something. It simply all stops.

Reply
  • Hi,

    I'm using nRF Connect for iOS version 2.7.13(40).

    I just tried Device Manager (1.8.1(3)) and it seems to have the same problem. It gets stuck on UPLOADING... after I press 'Start'. Other functions seem to work fine.

    I I remove CONFIG_MCUMGR_TRANSPORT_BT_PERM_RW_ENCRYPT I can't use the SMP functions at all. The device disconnects and reports an error (over RTT) that the requested authentication level can't be reached (which is an understandable error in my situation).

    During the last tests I had some logging intensive task running and I noticed that during the upload the device (also) logs 'mcuboot_util: Image index: 0, Swap type: none' and keeps logging the other task for a short while and then completely stops. It seems the processor completely hangs. There is no other output and running the firmware in the debugger also doesn't trigger a breakpoint or something. It simply all stops.

Children
  • Hi,

    If you have any newer Android phone, you could try to test if the same problem appears when using corresponding Android applications.

    Best regards,
    Dejan

  • I'm sorry, I don't have a recent Android device.

    I think the problem is on the firmware side, as I can't explain the hanging of the firmware otherwise. That would explain why the mobile device can't continue the upload, though.

  • Hi,

    Remco Poelstra said:
    I just tried Device Manager (1.8.1(3)) and it seems to have the same problem. It gets stuck on UPLOADING... after I press 'Start'. Other functions seem to work fine.
    Remco Poelstra said:
    During the last tests I had some logging intensive task running and I noticed that during the upload the device (also) logs 'mcuboot_util: Image index: 0, Swap type: none' and keeps logging the other task for a short while and then completely stops. It seems the processor completely hangs. There is no other output and running the firmware in the debugger also doesn't trigger a breakpoint or something. It simply all stops.

    What happens if you remove all logging intensive tasks? 

    Could you provide full logs when using both nRF Connect for iOS and Device Manager?

    Best regards,
    Dejan

  • Hi,

    Thanks for suggesting to disable the logging intensive task. This allowed me to track down the issue to the following call:

    err = adc_read_async(adc, &sequence, &adc_signal);

    It doesn't matter whether I include the &adc_signal or not, if it's NULL the firmware upload still times out. So it seems that the ADC code running results in some kind of dead-lock in the system.

    I've included the complete ADC reading code, but it's basically a timer that calls adc_read_async at a 100 Hz interval. I'm unsure why this should cause a dead-lock when used in companion with the mcumgr code.

    #include <zephyr/kernel.h>
    #include <zephyr/drivers/adc.h>
    #include <zephyr/logging/log.h>
    LOG_MODULE_REGISTER(adc, LOG_LEVEL_DBG);
    
    #include "../common.h"
    
    #ifdef CONFIG_BOARD_T502534
    #define ADC_NODE DT_NODELABEL(adc) // ADC node from the device tree
    static const struct device *adc = DEVICE_DT_GET(ADC_NODE); //Data of ADC device specified in device tree
    static const struct adc_channel_cfg channel_cfgs[] = {
        DT_FOREACH_CHILD_SEP(ADC_NODE, ADC_CHANNEL_CFG_DT, (,))
    };
    #define CHANNEL_COUNT ARRAY_SIZE(channel_cfgs)
    #endif
    
    /* Define a variable of type adc_sequence and a buffer of type int16_t */
    static int16_t buf[CHANNEL_COUNT];
    static struct adc_sequence sequence = {
        .buffer = buf,
        .buffer_size = sizeof(buf),
        .resolution = 12,
        .oversampling = 0,
        // Optional
        .calibrate = true, //TODO: Will this cause the ADC to calibrate for each sample?? We don't want that probably
    };
    
    static struct k_poll_signal adc_signal;
    static struct k_poll_event adc_event;
    
    /**
     * ADC main loop called by the timer
     */
    static void adc_timer_handler(struct k_timer *timer_id)
    {
    #ifdef CONFIG_BOARD_T502534
        int err;
        err = adc_read_async(adc, &sequence, &adc_signal);
        if (err < 0) {
            LOG_ERR("Failed to initialize ADC read: %d", err);
            return;
        }
    #endif    
    }
    
    void adcInit(void) {
    #ifdef CONFIG_BOARD_T502534
        ASSERT_ERROR_AND_REBOOT(!device_is_ready(adc));
    
        for (size_t i = 0; i < CHANNEL_COUNT; i++) {
            sequence.channels |= BIT(channel_cfgs[i].channel_id);
            ASSERT_ERROR_AND_REBOOT(adc_channel_setup(adc, &channel_cfgs[i]));
        }
    
        // use a timer to precisely trigger ADC reads at 100hz
        static struct k_timer adc_timer;
        k_timer_init(&adc_timer, adc_timer_handler, NULL);
        k_timer_start(&adc_timer, K_MSEC(10), K_MSEC(10));
    
        k_poll_signal_init(&adc_signal);
        k_poll_event_init(&adc_event, K_POLL_TYPE_SIGNAL, K_POLL_MODE_NOTIFY_ONLY, &adc_signal);
    #endif
    }
    
    static void adc_report_error(int result) {
        switch(result) {
            case 0:
                // Success
            case -EINVAL: 
                LOG_WRN("Parameter with an invalid value has been provided");
                break;
            case -ENOMEM: 
                LOG_WRN("Provided buffer is to small");
                break;
            case -ENOTSUP: 
                LOG_WRN("Requested mode of operation is not supported");
                break;
            case -EBUSY: 
                LOG_WRN("Another sampling was triggered while the previous one was still in progress");
                // try a longer interval
                break;
            case -EAGAIN:
                LOG_WRN("ADC read returned EAGAIN, conversion not ready");
                // try blocking reads instead
                break;
            default:
                LOG_ERR("Error reading ADC (%d)", result);
        }
    }
    
    int getPositionBlocking(int16_t val[2], k_timeout_t timeout) {
        // wait for adc read to complete
        int ret = k_poll(&adc_event, 1, timeout);
        if(ret != 0) {
            // LOG_ERR("k_poll failed (%d)", ret);
            return -EAGAIN;
        }
    
        // check if adc has error flags
        if (0 != adc_event.signal->result) {
            adc_report_error(adc_event.signal->result);
            return adc_event.signal->result;
        }
    
        // inform callers of updated value, available at &buf
        val[0] = buf[0];
        val[1] = buf[1];
    
        // housekeeping at end of loop
        adc_event.signal->signaled = 0;
        adc_event.state = K_POLL_STATE_NOT_READY;
    
        return 0; // return success
    }
    
    void getPosition(int16_t val[2]) {
        val[0] = buf[0];
        val[1] = buf[1];
    }

  • Regarding the logs:

    I couldn't get Device manager to log anything, but I've attached logs of the firmware and of nRF Connect for iOS.

    SEGGER J-Link V7.94i - Real time terminal output
    SEGGER J-Link V11.0, SN=821001711
    Process: JLink.exe
    [00:00:14.754,455] <dbg> main: main: Hello world!
    --- 3 messages dropped ---
    [00:00:14.754,760] <dbg> fs_nvs: nvs_recover_last_ate: Recovering last ate from sector 0
    [00:00:14.760,223] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:14.760,253] <inf> fs_nvs: alloc wra: 0, d68
    [00:00:14.760,253] <inf> fs_nvs: data wra: 0, 3f0
    [00:00:14.760,437] <inf> bt_sdc_hci_driver: SoftDevice Controller build revision: 
                                                fe 2c f9 6a 7f 36 22 2e  a0 79 c0 40 be 2c 03 20 |.,.j.6". .y.@.,. 
                                                40 c2 f3 32                                      |@..2             
    [00:00:14.763,580] <inf> bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
    [00:00:14.763,610] <inf> bt_hci_core: HW Variant: nRF52x (0x0002)
    [00:00:14.763,641] <inf> bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 254.63788 Build 573996906
    [00:00:14.764,160] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:14.768,096] <inf> bt_hci_core: Identity: F3:A9:1D:26:35:79 (random)
    [00:00:14.768,127] <inf> bt_hci_core: HCI: version 6.0 (0x0e) revision 0x104e, manufacturer 0x0059
    [00:00:14.768,157] <inf> bt_hci_core: LMP: version 6.0 (0x0e) subver 0x104e
    [00:00:14.778,778] <dbg> pwm_nrfx: pwm_nrfx_set_cycles: channel 0, pulse 0, period 32000, prescaler: 0.
    [00:00:14.778,839] <dbg> pwm_nrfx: pwm_nrfx_set_cycles: channel 0, pulse 0, period 32000, prescaler: 0.
    [00:00:19.924,438] <dbg> ble: ble_init: BLE enabled.
    [00:00:19.925,933] <inf> ble: --- MY MAC ADDRESS: F3:A9:1D:26:35:79 (random) ---
    [00:00:19.925,933] <inf> ble: Known bonds:
    [00:00:19.926,025] <inf> ble: 0. 9C:25:95:6E:95:66 (public)
    [00:00:19.926,116] <inf> ble: 1. 84:AB:1A:A1:0E:BF (public)
    [00:00:19.926,147] <inf> ble: ---- total: 2 ----
    [00:00:19.930,206] <wrn> bt_hci_core: opcode 0x2027 status 0x12 
    [00:00:19.930,236] <err> bt_id: Failed to add IRK to controller
    [00:00:19.930,938] <wrn> bt_hci_core: opcode 0x2027 status 0x12 
    [00:00:19.930,969] <err> bt_id: Failed to add IRK to controller
    [00:00:21.931,488] <inf> ble: ---- Acceptance list ----
    [00:00:21.931,793] <inf> ble: 9C:25:95:6E:95:66 (public)
    [00:00:21.932,098] <inf> ble: 84:AB:1A:A1:0E:BF (public)
    [00:00:21.932,128] <inf> ble: ---- list length = 2 ----
    [00:00:21.932,128] <dbg> ble: advertise_with_acceptlist: Acceptlist has 2 entries
    [00:00:21.933,258] <dbg> ble: advertise_with_acceptlist: Advertising successfully started.
    <<<<connect>>>>
    [00:01:23.361,175] <dbg> ble: mtu_updated: Updated MTU: TX: 23 RX: 23 bytes
    [00:01:23.364,074] <inf> ble: Connected via BLE
    [00:01:23.364,685] <dbg> sc: commandHandler: Handling command: PairingModeSetState (26)
    [00:01:23.970,733] <inf> ble: Security changed: 84:AB:1A:A1:0E:BF (public) level 2
    [00:01:24.000,366] <wrn> bt_l2cap: Ignoring data for unknown channel ID 0x003a
    [00:01:24.030,334] <dbg> ble: mtu_updated: Updated MTU: TX: 527 RX: 498 bytes
    <<<<upload>>>>
    
    [13:47:32.0770] Normal: Scanner On.
    [13:47:32.2800] Normal: Device Scanned.
    [13:47:41.0990] Normal: Connected.
    [13:47:41.2280] Normal: Discovered SMP Service and 94FF08D8-7129-4446-8057-4A1AB3B2EBC1 Services.
    [13:47:41.2320] Normal: Discovered SMP Characteristic Characteristics for Service SMP Service.
    [13:47:41.2520] Normal: Discovered 94FFFA00-7129-4446-8057-4A1AB3B2EBC1, 94FFFA01-7129-4446-8057-4A1AB3B2EBC1, 94FFFA02-7129-4446-8057-4A1AB3B2EBC1, 94FFFA03-7129-4446-8057-4A1AB3B2EBC1, 94FFFA07-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0A-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0B-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0C-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0D-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0E-7129-4446-8057-4A1AB3B2EBC1, 94FFFA0F-7129-4446-8057-4A1AB3B2EBC1, 94FFFA10-7129-4446-8057-4A1AB3B2EBC1, 94FFFA14-7129-4446-8057-4A1AB3B2EBC1, 94FFFA29-7129-4446-8057-4A1AB3B2EBC1, 94FFFA2A-7129-4446-8057-4A1AB3B2EBC1, 94FFFA2B-7129-4446-8057-4A1AB3B2EBC1, 94FFFA37-7129-4446-8057-4A1AB3B2EBC1, and 94FFFA3A-7129-4446-8057-4A1AB3B2EBC1 Characteristics for Service 94FF08D8-7129-4446-8057-4A1AB3B2EBC1.
    [13:47:41.2520] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic SMP Characteristic
    [13:47:41.2520] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA00-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2530] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA01-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2530] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA02-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2530] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA03-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2530] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA07-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2530] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0A-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0B-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0C-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0D-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0E-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA0F-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2540] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA10-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2550] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA14-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2550] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA29-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2550] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA2A-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2550] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA2B-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2560] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA37-7129-4446-8057-4A1AB3B2EBC1
    [13:47:41.2560] Normal: Discovered Client Characteristic Configuration Descriptors for Characteristic 94FFFA3A-7129-4446-8057-4A1AB3B2EBC1
    [13:47:42.1490] Normal: Appearance changed from Generic to McuMgrDFU.
    [13:47:42.5680] Normal: Found valid Firmware in file:///private/var/mobile/Containers/Data/Application/73766C67-A4C0-4FD7-B430-DE6AFA4D74E4/Documents/dfu_application%202.zip for Device DFU McuMgr.
    [13:47:54.4600] Normal: Upgrade started with 1 image(s) using '.testOnly' mode
    [13:47:54.4600] Normal: Firmware Upgrade Started.
    [13:47:54.4610] Normal: State changed from none to requestMcuMgrParameters
    [13:47:54.4630] Normal: Central Manager ready
    [13:47:54.4630] Normal: Peripheral connected
    [13:47:54.5440] Normal: Device ready
    [13:47:54.6380] Normal: Mcu Manager parameters received (4 x 2475)
    [13:47:54.6380] Normal: State changed from requestMcuMgrParameters to bootloaderInfo
    [13:47:54.8160] Normal: Bootloader Info received (Name: MCUboot)
    [13:47:54.9070] Normal: Bootloader Mode received (Mode: Swap without scratch)
    [13:47:54.9080] Normal: State changed from bootloaderInfo to validate
    [13:47:54.9990] Normal: Image List response: Header: {"version": "1", "op": "1", "flags": 0, "length": 134, "group": 1, "seqNum": 215, "commandId": 0}, Payload: {"images" : {{"pending" : false, "slot" : 0, "active" : true, "permanent" : false, "version" : "1.0.0", "bootable" : true, "confirmed" : true, "hash" : 0x95343950CD4D75F1FCD4C70E51BAA7B6C451E615269D56151290131956123F24}}, "splitStatus" : 0}
    [13:47:54.9990] Normal: Scheduling upload (hash: 0xDB18C0FDE243BFC8546A04A2A7957C90D61CA79222120D15CB547B48077B2E8A) for image 0 (slot: 1)
    [13:47:55.0100] Normal: State changed from validate to upload
    [13:48:05.3490] Normal: Retry 1 for seq: 217
    [13:48:05.3510] Normal: Retry 1 for seq: 218
    [13:48:05.3550] Normal: Retry 1 for seq: 219
    [13:48:15.3570] Normal: Retry 2 for seq: 217
    [13:48:15.3570] Normal: Retry 2 for seq: 218
    [13:48:15.3640] Normal: Retry 2 for seq: 219
    [13:48:25.3620] Normal: Retry 3 for seq: 217
    [13:48:25.3630] Error: Request (SMPv2, group: image, seq: 217, command: upload) failed: Sending the request failed.
    [13:48:25.3630] Error: Sending the request failed.
    [13:48:25.3630] Error: Upload cancelled due to error: sendFailed
    [13:48:25.3630] Error: DFU failed: Sending the request failed.
    [13:48:25.3640] Error: DFU Failed with Error: Sending the request failed.
    [13:48:25.3660] Normal: Retry 3 for seq: 218
    [13:48:25.3710] Normal: Retry 3 for seq: 219
    [13:48:55.5200] Normal: Disconnected.
    [13:49:32.0610] Normal: Scanner Off.

Related