
NRF Connect SDK missing mesh-local UDP6 broadcasts from openthread

Hello,

I'm developing an application using Zephyr, OpenThread, UDP, and NFC, targeting the nRF52840 (currently developing on the nRF52840 DK, PCA10056).

I initially developed and tested the application against upstream Zephyr (not the nRF Connect SDK). The application works as expected on Zephyr 2.4.

I recently ported the application to the nRF Connect SDK, and after doing so I noticed a problem with UDP multicast delivery to mesh-local addresses when using the Zephyr POSIX API.

The relevant code in my application is loosely similar to Zephyr's 'echo-server' sample: it opens a datagram socket, calls bind(), and then calls recvfrom().
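
For readers who want something concrete, the receive path described above can be sketched roughly as follows. This is not the application's actual code. It is a minimal sketch assuming standard POSIX headers on a host system, and the function names `open_udp6_listener` and `recv_one` are made up for illustration. On Zephyr 2.4 the same calls are, as far as I know, available through `<net/socket.h>` with `CONFIG_NET_SOCKETS_POSIX_NAMES=y`, but treat that as an assumption and check your tree.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open an IPv6 UDP socket bound to the wildcard
 * address (::) on `port`. Returns the descriptor, or -1 on failure. */
int open_udp6_listener(uint16_t port)
{
    int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0) {
        return -1;
    }

    struct sockaddr_in6 addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin6_family = AF_INET6;
    addr.sin6_port = htons(port);
    addr.sin6_addr = in6addr_any; /* accept any local destination address */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: block until one datagram arrives. Returns its
 * length, or -1 on error; the sender's address is written to `src`. */
ssize_t recv_one(int fd, void *buf, size_t len, struct sockaddr_in6 *src)
{
    socklen_t src_len = sizeof(*src);
    return recvfrom(fd, buf, len, 0, (struct sockaddr *)src, &src_len);
}
```

Binding to the wildcard address does not by itself subscribe the socket to any multicast group; whether a multicast datagram reaches recvfrom() depends on what the IP layer underneath accepts.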

While building against upstream Zephyr only, all UDP multicast messages sent to FF02::1 and FF03::1 were delivered to my application through the BSD sockets API. After migrating to the nRF Connect SDK, the application receives only the datagrams sent to FF02::1 and no longer receives those sent to FF03::1. The application is configured as a Thread FTD, so it should receive any datagram with destination FF03::1.

From the CLI, pings to FF02::1 and FF03::1 work normally between all devices. When I use OpenThread's UDP CLI commands to bind to a port, the device receives all FF02::1 and FF03::1 multicasts as expected. When binding through the BSD sockets API, however, my application receives only the FF02::1 multicasts, not the FF03::1 ones. I have not tried using the OpenThread UDP API directly in my application; I suspect it would work, given the results with the OpenThread UDP CLI commands.

It seems that datagrams with destination FF03::1 are received by the OpenThread stack but are not delivered to my application through the POSIX API, which suggests a configuration issue or some other problem in the hand-off from OpenThread to Zephyr networking. Is there a socket option or Kconfig option that the nRF Connect SDK's Zephyr needs in order to receive these datagrams? I am confused because the same code works on the vanilla Zephyr project but not here.

If I erase my DK and reflash the same code using Zephyr with no NRF Connect SDK, the application behaves normally again.

What am I missing here? Any help is greatly appreciated :)

  • After more investigation, I found that a recent change to Zephyr causes this change in behavior. The change is included in the current (1.4.99) nRF Connect SDK fork of Zephyr, but the Zephyr version I built my original app against was older and did not yet have it.

    This Zephyr commit adds code to net_ipv6_input() that filters multicast messages coming from lower layers, specifically those destined for a multicast address that is neither an all-nodes address nor a joined group.

    The change reflects RFC 4291 (IPv6 addressing), Sections 2.7.1 and 2.8, which specify that a host must recognize the all-nodes addresses FF01::1 and FF02::1 as well as any multicast group addresses it has joined. Thread differs in that it also specifies FF03::1 as a mesh-local all-nodes multicast address.

    This change caused my packet drops when using Zephyr's BSD sockets API on a mesh L2: Zephyr does not treat FF03::1 as an all-nodes multicast address by default, and multicast addresses added to Zephyr through the OpenThread integration are not joined by default, so the packets are dropped.
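
To make the drop decision concrete, here is a simplified, host-runnable model of the filtering rule as I understand it from the description above. It is not Zephyr's actual code: the `mcast_entry` struct, the `accept_multicast` function, and the table layout are all hypothetical and exist only to illustrate the logic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one multicast address registered on an interface. */
struct mcast_entry {
    uint8_t addr[16];
    bool is_joined; /* registered addresses are not necessarily joined */
};

/* FF01::1 (interface-local) and FF02::1 (link-local) all-nodes addresses. */
static const uint8_t ALL_NODES_IF[16] =
    {0xff, 0x01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01};
static const uint8_t ALL_NODES_LINK[16] =
    {0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01};

/* Returns true if a packet with destination `dst` passes the filter. */
bool accept_multicast(const uint8_t dst[16],
                      const struct mcast_entry *table, size_t count)
{
    if (dst[0] != 0xff) {
        return true; /* not multicast, so this filter does not apply */
    }
    if (memcmp(dst, ALL_NODES_IF, 16) == 0 ||
        memcmp(dst, ALL_NODES_LINK, 16) == 0) {
        return true; /* all-nodes addresses are always accepted */
    }
    for (size_t i = 0; i < count; i++) {
        if (table[i].is_joined && memcmp(dst, table[i].addr, 16) == 0) {
            return true; /* explicitly joined group */
        }
    }
    return false; /* known-but-not-joined or unknown group: dropped */
}
```

In this model, an FF03::1 entry that is present in the table but has `is_joined == false` is rejected, while FF02::1 always passes. Setting `is_joined` to true on that entry makes FF03::1 pass as well.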


    A few ways around this:

    • Use the OpenThread API directly instead of the BSD sockets API. Not ideal for portability, but it works around the issue.
    • Create and join an application-specific multicast group from all application nodes. This is the preferable solution.
    • Have the application call net_if_ipv6_maddr_join() manually to join FF03::1 and any other desired Thread-specific multicast addresses. This works, but it seems improper to 'join' a reserved all-nodes address.
    • Modify the multicast packet filtering in Zephyr's net_ipv6_input() to exempt FF03::1 from being dropped. This could be acceptable because Zephyr already has net_ipv6_is_addr_mcast_mesh(), which could be used to avoid dropping multicast packets sent to mesh-local all-nodes addresses, but it seems improper because the IPv6 implementation shouldn't know about mesh addresses.
    • Modify the Zephyr OpenThread bridge to call net_if_ipv6_maddr_join() on FF03::1 by default when it adds an OpenThread multicast address to Zephyr. This also works, but again it involves calling join on a reserved all-nodes address, which feels strange.

    Any thoughts from Nordic on this issue and the best way to solve it for nRF Connect SDK applications?

  • Hello,

    Thank you for the information. I ran this by our Thread team, and they say that your suggested workarounds seem reasonable. They will look into this in more detail and check whether there is a reasonable way to patch this in NCS. Please note that Nordic Semiconductor does not write the OpenThread implementation in Zephyr, so perhaps you can file a bug report there as well. I expect our Thread team will do so, but the fix may land in an earlier release if it is reported from several sides.

    Best regards,

    Edvin

  • Thank you for the reply, Edvin. I am aware that Nordic does not write the OpenThread integration in Zephyr. When I submitted the original ticket, I thought the issue might be tied to the NCS implementation, since it was reproducible in NCS but not in Zephyr. After looking further, I can confirm the issue is actually in Zephyr.

    Yesterday I filed a Zephyr issue to highlight the problem. I think the fix belongs in Zephyr's OpenThread shim layer. I have opened a pull request to resolve the issue (using my last suggestion above, joining any mesh-specific OpenThread multicast addresses) and am awaiting review from the Zephyr developers. See the issue and PR for details.

    Thank you

  • Thank you as well! I notified the Thread team about your Zephyr issue and pull request. They will look into this starting Monday and work out how to make sure that either this Zephyr patch or a workaround is included in our future releases.

    Thank you for bringing this bug up.

    Best regards,

    Edvin
