
nRF Connect SDK missing mesh-local UDP6 multicasts from OpenThread

Hello,

I'm developing an application using Zephyr, OpenThread, UDP, and NFC, targeting the nRF52840 (currently developing on the nRF52840 DK, PCA10056).

I initially developed and tested the application using upstream open-source Zephyr (not the nRF Connect SDK). The application works as expected on Zephyr 2.4.

I recently ported the application to the nRF Connect SDK, and after doing so I noticed an issue with UDP multicast delivery to mesh-local addresses when using the Zephyr POSIX API.

The relevant code in my application is loosely similar to Zephyr's 'echo-server' sample: it opens a datagram socket, calls bind(), and then calls recvfrom().
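For readers unfamiliar with that pattern, here is a minimal sketch of a UDP6 receiver using plain POSIX calls. It is not the code from my application or from the echo-server sample; the helper names (make_any_addr6, open_udp6_listener, receive_one) are made up for illustration, and on Zephyr it assumes the POSIX socket names are enabled in the build.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill in a wildcard (::) IPv6 address for the given UDP port. */
void make_any_addr6(struct sockaddr_in6 *sa, uint16_t port)
{
    assert(sa != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);
    sa->sin6_addr = in6addr_any;
}

/* Open a UDP6 socket bound to the wildcard address.
 * Returns the socket descriptor, or -1 on failure. */
int open_udp6_listener(uint16_t port)
{
    struct sockaddr_in6 sa;
    int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);

    if (fd < 0) {
        return -1;
    }
    make_any_addr6(&sa, port);
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until one datagram arrives; the sender is stored in *src.
 * Returns the number of bytes received, or -1 on error. */
ssize_t receive_one(int fd, void *buf, size_t len, struct sockaddr_in6 *src)
{
    socklen_t src_len = sizeof(*src);

    return recvfrom(fd, buf, len, 0, (struct sockaddr *)src, &src_len);
}
```

A caller would open the listener once and then call receive_one() in a loop. Nothing in this code distinguishes FF02::1 from FF03::1, so whether a multicast datagram reaches recvfrom() is decided lower in the stack.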

When using Zephyr only, before migrating to the nRF Connect SDK, all UDP multicast messages sent to FF02::1 and FF03::1 were delivered to my application through the BSD sockets API. After migrating, the application no longer receives UDP multicast datagrams sent to FF03::1, only those sent to FF02::1. The application is configured as a Thread FTD, so it should receive any datagram with destination FF03::1.

From the CLI, pings to FF02::1 and FF03::1 work normally between all devices. When I use OpenThread's udp CLI commands to bind to a port, the device receives all FF02::1 and FF03::1 multicasts as expected. However, when binding through the BSD sockets API, my application receives only the FF02::1 multicasts, not the FF03::1 ones. I have not tried using the OpenThread UDP API directly in my application; I suspect that would work, given the results with the udp commands in the OpenThread CLI.

It seems the datagrams with destination FF03::1 are received by the OpenThread stack but are not passed through the POSIX API to my application, which suggests a configuration issue or some other problem with the hand-off from OpenThread to Zephyr networking. Is there a socket option or Kconfig option that the nRF Connect SDK's Zephyr needs in order to receive these datagrams? I am confused because the same code works on the vanilla Zephyr project but not here.

If I erase my DK and reflash the same code built with Zephyr alone (no nRF Connect SDK), the application behaves normally again.

What am I missing here? Any help is greatly appreciated :)

  • After more investigation, I found that a recent change to Zephyr causes the change in behavior described above. This change is included in the current (1.4.99) nRF Connect SDK fork of Zephyr, but the Zephyr version I built my original app against was older and did not have it yet.

    This Zephyr commit adds code to net_ipv6_input() that filters multicast messages coming from lower layers, specifically those destined for a multicast address that is neither an all-nodes address nor a joined group.

    The change reflects RFC 4291 (IPv6 Addressing Architecture), Sections 2.7.1 and 2.8, which specify that a host must recognize the all-nodes addresses FF01::1 and FF02::1, as well as the multicast addresses of any groups it has joined. Thread differs in that it also specifies FF03::1 as a mesh-local all-nodes multicast address.

    This change caused the packet drops I saw when using Zephyr's BSD sockets API on a mesh L2: Zephyr doesn't treat FF03::1 as an all-nodes multicast address by default, and multicast addresses added to Zephyr through the OpenThread integration are not marked as joined by default, so the packets are dropped.
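To make the drop rule concrete, here is a simplified, host-side model of the filtering described above. It is not the Zephyr source: the struct mcast_entry table, its is_joined flag, and the function names are hypothetical stand-ins for the interface's multicast address list.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of an interface's multicast table. */
struct mcast_entry {
    struct in6_addr addr;
    bool is_joined;
};

/* True for the RFC 4291 all-nodes addresses FF01::1 and FF02::1 only. */
static bool is_all_nodes(const struct in6_addr *a)
{
    if (a->s6_addr[0] != 0xff) {
        return false;
    }
    if (a->s6_addr[1] != 0x01 && a->s6_addr[1] != 0x02) {
        return false;
    }
    for (size_t i = 2; i < 15; i++) {
        if (a->s6_addr[i] != 0) {
            return false;
        }
    }
    return a->s6_addr[15] == 0x01;
}

/* Model of the filter: a multicast destination is accepted only if it is
 * an all-nodes address or matches a table entry that is marked joined. */
bool accept_multicast(const struct in6_addr *dst,
                      const struct mcast_entry *table, size_t count)
{
    assert(dst != NULL);
    if (dst->s6_addr[0] != 0xff) {
        return true; /* not multicast, so this filter does not apply */
    }
    if (is_all_nodes(dst)) {
        return true;
    }
    for (size_t i = 0; i < count; i++) {
        if (table[i].is_joined &&
            memcmp(&table[i].addr, dst, sizeof(*dst)) == 0) {
            return true;
        }
    }
    return false;
}
```

In this model FF02::1 always passes, while FF03::1 passes only once its table entry is marked as joined, which matches the behavior I observed.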


    A few ways around this:

    • Use the OpenThread API directly instead of the BSD sockets API. Not ideal for portability, but it works around the issue.
    • Create an application-specific multicast group and join it from all application nodes. This is the preferable solution.
    • Have the application call net_if_ipv6_maddr_join() manually to join FF03::1 and any other desired Thread-specific multicast addresses. This works, but it seems improper to 'join' a reserved all-nodes address.
    • Modify the multicast packet filtering in Zephyr's net_ipv6_input() to exempt FF03::1 from the drop. This could be acceptable because Zephyr already has net_ipv6_is_addr_mcast_mesh(), which could be used to keep multicast packets for mesh-local all-nodes addresses, but it seems improper because the IPv6 implementation shouldn't need to know about mesh addresses.
    • Modify the Zephyr OpenThread bridge to call net_if_ipv6_maddr_join() on FF03::1 by default when it adds an OpenThread multicast address to Zephyr. This also works, but again it involves calling join on a reserved all-nodes address, which feels strange.
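As a rough illustration of the second option, the sketch below joins an application-specific group through the standard POSIX IPV6_JOIN_GROUP socket option. The group address ff03::1234:5678 and the helper names are placeholders invented for this example, and I have not confirmed that the Zephyr socket layer in a given release supports this option; on Zephyr the join may need to go through the net_if multicast calls mentioned above instead.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical application-specific group with scope 3, chosen only
 * for illustration. */
#define APP_MCAST_GROUP "ff03::1234:5678"

/* Build a join request for the given group and interface index.
 * Returns 0 on success, or -1 if the string is not a valid IPv6
 * multicast address. */
int build_group_request(const char *group, unsigned int ifindex,
                        struct ipv6_mreq *mreq)
{
    assert(mreq != NULL);
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET6, group, &mreq->ipv6mr_multiaddr) != 1) {
        return -1;
    }
    if (!IN6_IS_ADDR_MULTICAST(&mreq->ipv6mr_multiaddr)) {
        return -1;
    }
    mreq->ipv6mr_interface = ifindex;
    return 0;
}

/* Join the application group on an already-open UDP6 socket.
 * Returns 0 on success, or -1 on failure. */
int join_app_group(int fd, unsigned int ifindex)
{
    struct ipv6_mreq mreq;

    if (build_group_request(APP_MCAST_GROUP, ifindex, &mreq) != 0) {
        return -1;
    }
    return setsockopt(fd, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                      &mreq, sizeof(mreq));
}
```

Every node in the application would join the same group after bind(), and senders would then target that group address instead of FF03::1.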

    Any thoughts from Nordic on this issue and the best way to solve it for nRF Connect SDK applications?

  • Hello,

    Thank you for the information. I ran this by our Thread team, and they say that your suggested workarounds seem reasonable. They will look into this in more detail and see whether there is a reasonable way to patch it in NCS. Please note that Nordic Semiconductor does not write the OpenThread implementation in Zephyr, so perhaps you can file a bug report there as well. I expect our Thread team will do so too, but a fix may land in an earlier release if the issue is reported from several sources.

    Best regards,

    Edvin

  • Thank you for the reply, Edvin. I am aware that Nordic does not write the OpenThread integration in Zephyr. When I submitted the original ticket, I thought the issue might be tied to the NCS implementation, since I could reproduce it in NCS but not in Zephyr. After looking further, I can confirm the issue is actually in Zephyr.

    Yesterday I filed a Zephyr issue to highlight the problem. I think the fix belongs there, in Zephyr's OpenThread shim layer. I have opened a pull request to resolve the issue (using my last suggestion above, joining any mesh-specific OpenThread multicast addresses) and am awaiting review from the Zephyr developers. See the issue and PR for details.

    Thank you
