Azure IoT Hub DPS library seems to always reprovision our device

I am working with SDK: 2.5.0 and toolchain 2.5.0.
I am using the new nRF7002DK.

I created an application from the azure_iot_hub sample.

I am using the DPS for my provisioning.
Note that we haven't modified the code other than adding log statements to help us debug.

I am experiencing the following issues:

During azure_iot_hub_dps_init(), the settings module is enabled correctly and the settings load succeeds:

[00:00:07.963,104] <wrn> azure_iot_hub_dps: azure_iot_hub_dps_init
[00:00:07.963,134] <inf> azure_iot_hub_dps: No registration ID provided, using ID from Kconfig: device-01
[00:00:07.963,165] <inf> azure_iot_hub_dps: Setting DPS registration ID: device-01
[00:00:07.963,226] <inf> azure_iot_hub_dps: No ID scope provided, using ID scope from Kconfig: ********
[00:00:07.963,256] <inf> azure_iot_hub_dps: Setting DPS ID scope: ********
[00:00:07.963,378] <inf> azure_iot_hub_dps: Assigned device ID found:
[00:00:07.963,531] <inf> azure_iot_hub_dps: Assigned device ID length found: 9
[00:00:07.963,714] <inf> azure_iot_hub_dps: Azure IoT Hub hostname found:
[00:00:07.963,928] <inf> azure_iot_hub_dps: Azure IoT Hub hostname length found: 29
[00:00:07.963,958] <inf> azure_iot_hub_dps: Settings fully loaded
[00:00:07.963,989] <inf> azure_iot_hub_dps: State transition: DPS_STATE_UNINIT --> DPS_STATE_DISCONNECTED
[00:00:07.964,019] <inf> azure_iot_hub_sample: DPS registration status: AZURE_IOT_HUB_DPS_REG_STATUS_NOT_STARTED

During azure_iot_hub_dps_start(), however, the global dps_reg_ctx does not appear to contain the loaded values. I am new to this, but could it be some thread caching problem?

[00:00:07.964,019] <wrn> azure_iot_hub_dps: azure_iot_hub_dps_start
[00:00:07.964,050] <wrn> azure_iot_hub_dps: Settings not found

Here's the code I added to get that output:

	if ((az_span_size(dps_reg_ctx.assigned_hub) > 0) &&
	    (az_span_size(dps_reg_ctx.assigned_device_id) > 0)) {
		LOG_WRN("Settings found");
		LOG_INF("Device \"%.*s\" is assigned to IoT hub: %.*s",
			az_span_size(dps_reg_ctx.assigned_device_id),
			(char *)az_span_ptr(dps_reg_ctx.assigned_device_id),
			az_span_size(dps_reg_ctx.assigned_hub),
			(char *)az_span_ptr(dps_reg_ctx.assigned_hub));

		LOG_INF("To re-register, call azure_iot_hub_dps_reset() first");

		dps_reg_ctx.status = AZURE_IOT_HUB_DPS_REG_STATUS_ASSIGNED;

		return -EALREADY;
	} else {
		LOG_WRN("Settings not found");
		LOG_INF("Device \"%.*s\" is assigned to IoT hub: %.*s",
			az_span_size(dps_reg_ctx.assigned_device_id),
			(char *)az_span_ptr(dps_reg_ctx.assigned_device_id),
			az_span_size(dps_reg_ctx.assigned_hub),
			(char *)az_span_ptr(dps_reg_ctx.assigned_hub));
	}
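
For anyone following along, the check above depends only on the sizes recorded in the two spans, not on what is sitting in the buffers behind them. The sketch below is a hypothetical stand-alone model of that behaviour, not the library's code: struct span, struct reg_ctx, settings_found() and load_value() are made-up stand-ins for az_span, dps_reg_ctx and the settings load path, and the hostname in the usage note is a placeholder. It illustrates one way, without involving threads at all, that a value can be logged as "found" during init and still fail the check in azure_iot_hub_dps_start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for az_span: a pointer plus a recorded length. */
struct span {
	const char *ptr;
	size_t size;
};

/* Hypothetical stand-in for the two dps_reg_ctx fields used by the check. */
struct reg_ctx {
	struct span assigned_hub;
	struct span assigned_device_id;
};

/* Same condition as the check above: only the recorded sizes matter. */
static bool settings_found(const struct reg_ctx *ctx)
{
	return (ctx->assigned_hub.size > 0) &&
	       (ctx->assigned_device_id.size > 0);
}

/*
 * Copy a loaded value into its backing buffer. The length is written
 * into the span only when record_size is true, which lets the caller
 * model a load path that forgets to update the span.
 */
static void load_value(struct span *dst, char *buf, size_t buf_size,
		       const char *value, bool record_size)
{
	assert(dst != NULL && buf != NULL && value != NULL);

	size_t len = strlen(value);

	if (len >= buf_size) {
		return; /* Value does not fit; leave the span untouched. */
	}

	memcpy(buf, value, len + 1);
	dst->ptr = buf;

	if (record_size) {
		dst->size = len;
	}
}
```

Calling load_value() with record_size set to false for both a placeholder hostname such as "example-hub.invalid" and the ID "device-01" leaves the bytes in the buffers, yet settings_found() still returns false. Repeating the calls with record_size set to true makes it return true. If the real context behaves like this, logging az_span_size() for both fields at the end of init and again at the start of azure_iot_hub_dps_start would show whether the sizes were ever recorded.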


After that, we go straight into the provisioning process:

[00:00:07.964,233] <inf> azure_iot_hub_dps: User name (size: 58): ******/registrations/device-01/api-version=2019-03-31
[00:00:07.964,263] <inf> mqtt_helper: State transition: MQTT_STATE_UNINIT --> MQTT_STATE_DISCONNECTED
[00:00:07.964,324] <inf> mqtt_helper: Resolving IP address for global.azure-devices-provisioning.net
[00:00:08.548,095] <inf> mqtt_helper: IPv4 Address found 52.228.85.227 (AF_INET)


Has anyone here already experienced the same issues?
Am I doing something wrong?

We have already experienced much trouble using this example on our nRF7002DK.
Has this example been thoroughly tested before releasing it to us?

Thanks for any advice regarding this issue. I am a beginner here, and any shared knowledge will be helpful.

Cheers!

  • Hi Vincent

    Yes, the Azure IoT Hub sample should be tested for the nRF7002 and work out of the box. Your log reports that no registration ID was provided, so can you confirm that you've set the configurations required to use DPS in your project?

    If DPS is used, use the Kconfig fragment found in the overlay-dps.conf file and change the desired configurations there. As an example, the following should compile with DPS for nRF7002DK:

    west build -p -b nrf7002dk_nrf5340_cpuapp -- -DOVERLAY_CONFIG=overlay-dps.conf
    

    Best regards,

    Simon

  • Hi Simonr,

    Thank you for your prompt response and the suggestions regarding the Azure IoT Hub and DPS configuration for the nRF7002. I appreciate your assistance.

    I want to clarify a few points that might have been overlooked in my initial query. From the log, it is evident that the registration ID is being correctly pulled from Kconfig, as indicated by these lines:

    [00:00:07.963,134] <inf> azure_iot_hub_dps: No registration ID provided, using ID from Kconfig: device-01
    [00:00:07.963,165] <inf> azure_iot_hub_dps: Setting DPS registration ID: device-01

    This confirms that the DPS overlay is active and functioning. However, the crux of the issue lies elsewhere. Even when the Azure IoT Hub information is saved and retrievable from the Zephyr settings module (under DPS_SETTINGS_KEY, "azure_iot_hub"), the device still goes through the DPS service, which is not the expected behaviour.

    The primary concern revolves around the azure_iot_hub_dps_start function in azure_iot_hub_dps.c, where the DPS is expected to be skipped if saved Azure IoT Hub information is found. Despite this, the dps_reg_ctx global variable doesn't seem to update accordingly in the DPS thread, suggesting possible issues with thread synchronization or variable caching.

    Additionally, the azure_iot_hub_connect function in azure_iot_hub.c seems to unnecessarily redo the DPS provisioning process, which should be redundant in this context.

    I believe these insights point towards a more specific area of the problem. I would greatly appreciate it if you could provide guidance or suggestions on these particular aspects:

    1. Ensuring proper synchronization and caching mechanisms for the dps_reg_ctx global variable across different threads.
    2. Understanding why the DPS service is reinitiated despite having valid Azure IoT Hub details in the settings.
    3. Addressing the apparent redundancy in the DPS provisioning process within the azure_iot_hub_connect function.

    Your expertise and deeper insights into these areas would be beneficial.

    We are looking forward to your guidance.

    Best regards,
    Vincent

  • Hi again Vincent

    Thank you for the extra information. Would it be possible for you to provide a Wireshark trace so the Azure library developers can take a look and see what's going on over the air? To do so, you can connect the nRF7002 DK to a Wi-Fi hotspot on your computer and capture a trace from that with Wireshark. Let me know if you need any further details. That way we can see exactly what is going on over the air and won't have to guess.

    Best regards,

    Simon

  • Hi Simonr,

    Thank you for your continued support. Before proceeding with the Wireshark trace, which I am fully prepared to conduct, I would like to bring to light some key observations and concerns.

    From my understanding and experience with the system, the initial communication with the Azure IoT Hub, including connecting and sending telemetry, seems to function correctly. This includes the use of the DPS for initial provisioning and connection. However, the problem becomes apparent during the reconnection phase, where the device gets reprovisioned each time due to the inability to bypass the DPS. This leads to unnecessary reprovisioning despite the device being previously provisioned and connected to the Azure IoT Hub.

    The core issue seems to be rooted more in the Nordic code handling rather than a communication problem. Specifically, the application's failure to properly recognize and utilize the saved Azure IoT Hub information gives the impression of no previous information being saved, triggering the reprovisioning process. This was detailed in my previous communication, highlighting concerns about the azure_iot_hub_dps_start function in azure_iot_hub_dps.c and the apparent issues with thread synchronization or variable caching, particularly with the dps_reg_ctx global variable.

    In light of these observations, I am keen to understand the specific objectives behind the request for a Wireshark trace:

    1. What aspects or anomalies in the network communication are we looking to identify in the Wireshark trace that could contribute to the reconnection and reprovisioning issues?
    2. How will the insights from the Wireshark trace help us address the issue of the application not recognizing the saved Azure IoT Hub information, especially considering this seems more related to the Nordic code’s handling of the process?

    Understanding the reasoning behind this request will greatly assist me in ensuring that the Wireshark trace is focused and relevant and will help us collaboratively pinpoint the exact nature of the problem more effectively.

    Looking forward to your insights and guidance on this matter.

    Best regards,
    Vincent

  • Hi again Vincent

    We were able to reproduce this on our end in the meantime, and one of our devs discovered that the hostname and device ID are not loaded correctly. This morning I was told a pull request has been made where the settings are changed on the backend. Please check it out and confirm whether it solves the issue on your end. https://github.com/nrfconnect/sdk-nrf/pull/13180/commits/e46e68c1cf4fdcf5e7a8d24df7e0b70561248c0b

    Best regards,

    Simon
