
BLE_HCI_INSTANT_PASSED disconnection after LL_CHANNEL_MAP_REQ during flash erase

We're seeing an issue where the SoftDevice stops responding and ultimately disconnects with the reason BLE_HCI_INSTANT_PASSED. The disconnection only seems to happen when an LL_CHANNEL_MAP_REQ message is received while the Nordic device is performing a flash erase. The Wireshark capture from a sniffer shows the master resending the LL_CHANNEL_MAP_REQ, but the slave doesn't send a response until after the instant has passed, triggering the BLE_HCI_INSTANT_PASSED disconnection.

We've captured the flash erase both with and without a channel map update request arriving during it. Without the request, the device also stops responding, but eventually recovers.

[Wireshark screenshot not preserved]

If the slave receives an LL_CHANNEL_MAP_REQ message during the erase, it misses the instant and disconnects.

[Wireshark screenshot not preserved]

We're using SoftDevice S140 version 6.1.1.

Is there any reason why the slave would stop responding during the flash erase if we're using the SoftDevice flash API? What determines how far into the future the instant is placed during a channel map update? Is there any other explanation for this behavior?

Thanks!

Edvin over 4 years ago

Hello,

What connection interval do you use?

Can you send the sniffer trace as a .pcapng file?

Flash operations can't be done at the same time as the radio is running. What kind of flash erase do you do? Are you using fstorage directly, or together with fds?

BR,
Edvin

rkorn over 4 years ago in reply to Edvin

Hi Edvin,

Thanks for your input. The connection interval is 15 ms. We're using fstorage and the SoftDevice flash API. Unfortunately, I can't share the full .pcapng file since it contains sensitive client information.

It's my understanding that the flash API takes care of scheduling flash operations so that they don't interfere with radio operations. According to the datasheet, a flash erase can halt the CPU for ~85 ms. I assumed that the SoftDevice breaks this up into partial erases in order to meet radio timing requirements. However, the trace suggests that it's actually missing connection intervals while the erase is happening.

Is it expected that the SoftDevice will miss some connection intervals in order to perform the erase? Missing connection intervals only seems to become a problem when a channel map update is received, because the slave then needs to respond and update its channel map before the instant, which occurs 6 connection intervals (90 ms) later.

I wonder if the situation could be improved by increasing the connection interval slightly so that the erase (plus some margin) will finish in under 6 connection intervals. Thoughts?
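To put rough numbers on that idea, here is a minimal sketch of the timing arithmetic. The 85 ms erase time and the 6-event distance to the instant come from this thread; the 5 ms margin and the function names are made up for illustration. It also assumes the erase blocks the radio for its whole duration, which may not match what the SoftDevice really does. If the request has to be received in an event before the instant, only 5 of the 6 intervals are usable, so both cases are worth checking.

```c
#include <assert.h>
#include <stdint.h>

/* BLE connection intervals are expressed in units of 1.25 ms. */
#define CONN_INTERVAL_UNIT_US 1250u

/* Time covered by `events` connection intervals, in microseconds. */
static uint32_t time_to_instant_us(uint16_t interval_units, uint16_t events)
{
    return (uint32_t)interval_units * CONN_INTERVAL_UNIT_US * events;
}

/*
 * Smallest connection interval (in 1.25 ms units) for which
 * `usable_events` intervals last strictly longer than the radio
 * blackout plus a safety margin. All inputs are assumptions supplied
 * by the caller, not values read from the SoftDevice.
 */
static uint16_t min_interval_units(uint32_t blackout_us,
                                   uint32_t margin_us,
                                   uint16_t usable_events)
{
    assert(usable_events > 0);
    uint32_t needed_us = blackout_us + margin_us;
    uint32_t per_unit_us = CONN_INTERVAL_UNIT_US * usable_events;
    /* floor(needed / per_unit) + 1 guarantees a strictly longer window. */
    return (uint16_t)(needed_us / per_unit_us + 1u);
}
```

At the current 15 ms interval (12 units), time_to_instant_us(12, 6) is 90000 us and time_to_instant_us(12, 5) is 75000 us, so an 85 ms blackout only fits if all 6 intervals count. With a 5 ms margin, min_interval_units(85000, 5000, 6) gives 13 units (16.25 ms) and min_interval_units(85000, 5000, 5) gives 15 units (18.75 ms).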

Edvin over 4 years ago in reply to rkorn

You are correct that a page erase can take up to 85 ms. It cannot be split up, as it is one HW operation (the minimum flash erase size is one flash page).

Typically, this is something we see when customers try to run GC (garbage collection with FDS) while they are in a connection with a fairly short connection interval.

I actually expected the flash erase to fail, because the SoftDevice would see that it doesn't have enough time, but I am not sure whether this decision is typically made in fstorage or in fds.

But absolutely: if you need to erase a flash page, increasing the connection interval is a good idea, as long as the central accepts it, of course. If you want to, you can wait to do this until you actually need to erase a page. Once the new connection interval is in effect, erase the page(s) you need to erase, and then request the normal connection parameters again.
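The sequence described above (slow the connection down, erase, speed it back up) could be tracked with a small state machine along these lines. This is only a sketch: every type, event, and action name below is hypothetical and nothing here is an SDK call. In a real application the events would presumably be fed from the BLE connection parameter update event and the fstorage erase result handler, and the returned action would be mapped to the corresponding SDK requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ERASE_IDLE, ERASE_WAIT_SLOW, ERASE_ERASING,
               ERASE_WAIT_NORMAL, ERASE_DONE, ERASE_FAILED } erase_state_t;

typedef enum { EV_START, EV_CONN_PARAMS_UPDATED, EV_CONN_PARAMS_REJECTED,
               EV_PAGE_ERASED, EV_ERASE_ERROR } erase_event_t;

typedef enum { ACT_NONE, ACT_REQUEST_SLOW_INTERVAL, ACT_ERASE_NEXT_PAGE,
               ACT_REQUEST_NORMAL_INTERVAL } erase_action_t;

typedef struct {
    erase_state_t state;
    uint32_t      pages_left;  /* pages still to erase */
    bool          failed;      /* set if any erase reported an error */
} erase_ctx_t;

/* Consume one event and return what the application should do next. */
static erase_action_t erase_step(erase_ctx_t *ctx, erase_event_t ev)
{
    assert(ctx != NULL);
    switch (ctx->state) {
    case ERASE_IDLE:
        if (ev == EV_START) {
            if (ctx->pages_left == 0) { ctx->state = ERASE_DONE; return ACT_NONE; }
            ctx->state = ERASE_WAIT_SLOW;
            return ACT_REQUEST_SLOW_INTERVAL;
        }
        break;
    case ERASE_WAIT_SLOW:
        if (ev == EV_CONN_PARAMS_UPDATED) {
            ctx->state = ERASE_ERASING;
            return ACT_ERASE_NEXT_PAGE;
        }
        if (ev == EV_CONN_PARAMS_REJECTED) {
            /* Central refused the longer interval: do not erase now. */
            ctx->state = ERASE_FAILED;
            return ACT_NONE;
        }
        break;
    case ERASE_ERASING:
        if (ev == EV_PAGE_ERASED) {
            if (--ctx->pages_left > 0) return ACT_ERASE_NEXT_PAGE;
            ctx->state = ERASE_WAIT_NORMAL;
            return ACT_REQUEST_NORMAL_INTERVAL;
        }
        if (ev == EV_ERASE_ERROR) {
            /* Stop erasing, but still restore the normal interval. */
            ctx->failed = true;
            ctx->state = ERASE_WAIT_NORMAL;
            return ACT_REQUEST_NORMAL_INTERVAL;
        }
        break;
    case ERASE_WAIT_NORMAL:
        if (ev == EV_CONN_PARAMS_UPDATED || ev == EV_CONN_PARAMS_REJECTED) {
            ctx->state = ctx->failed ? ERASE_FAILED : ERASE_DONE;
            return ACT_NONE;
        }
        break;
    default:
        break;
    }
    return ACT_NONE; /* unexpected event for this state: ignore it */
}
```

Keeping the logic event-driven rather than blocking is a design choice that fits operations which complete asynchronously; the erase is only started after the central has confirmed the longer interval, and the normal interval is requested again even if an erase fails.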

Best regards,

Edvin

BretH over 4 years ago in reply to Edvin

Hi Edvin. I am working together with rkorn.

I like the idea of temporarily increasing the connection interval. Alternatively, we can schedule our fstorage erase prior to our BLE connection.

That said, I have a general question about the LL_CHANNEL_MAP_REQ or a connection parameter update request. Both of these involve scheduling the channel map update or connection parameter update at an instant in the future.

In the Wireshark screenshot provided in the original post, we can see the master send the LL_CHANNEL_MAP_REQ 7 times back to back. The 7th request is finally received and processed by the SoftDevice, and a return packet is sent. The first LL_CHANNEL_MAP_REQ packet was sent in connection event 8352 with a scheduled instant of 8358. Because the flash erase takes some time and the SoftDevice misses packets, the first LL_CHANNEL_MAP_REQ it actually receives is in connection event 8358, the same event as the scheduled instant! The Nordic device responds but ultimately disconnects with the BLE_HCI_INSTANT_PASSED reason.

While this has a low probability of occurring, we are seeing it frequently enough to be a nuisance. Our case is caused by the flash erase preventing reception of the first 6 packets, but it seems the same thing could happen under poor link conditions, where the peripheral fails to receive the packets before the instant occurs. Given the guaranteed delivery feature of the LL, this seems to be a hole in the BLE spec. Or is there a way for the SoftDevice to recover without disconnecting? It seems like the HCI_INSTANT_PASSED is inevitable. Is this true?
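For reference, here is a small sketch of the event-counter arithmetic behind this question. It reflects my reading of the Link Layer rules: the connection event counter is 16 bits wide, the comparison is done modulo 65536, and a difference of 32767 or more means the instant is already in the past. A request that arrives exactly at the instant is kept as its own case, because that is the situation that led to the disconnect in our capture. The type and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { INSTANT_IN_FUTURE, INSTANT_NOW, INSTANT_PASSED } instant_status_t;

/*
 * Classify an instant relative to the connection event in which the
 * request was actually received. Both counters are 16 bits wide, so
 * the subtraction is taken modulo 65536.
 */
static instant_status_t classify_instant(uint16_t instant,
                                         uint16_t conn_event_count)
{
    uint16_t delta = (uint16_t)(instant - conn_event_count);
    if (delta == 0u) {
        return INSTANT_NOW;      /* received in the same event as the instant */
    }
    if (delta >= 32767u) {
        return INSTANT_PASSED;   /* the instant is behind us */
    }
    return INSTANT_IN_FUTURE;
}

/* Number of connection events left before the instant (wrap-safe). */
static uint16_t events_until_instant(uint16_t instant,
                                     uint16_t conn_event_count)
{
    assert(classify_instant(instant, conn_event_count) == INSTANT_IN_FUTURE);
    return (uint16_t)(instant - conn_event_count);
}
```

With the numbers from our capture, classify_instant(8358, 8352) is INSTANT_IN_FUTURE with 6 events to go, while classify_instant(8358, 8358) is INSTANT_NOW, which is where the first successfully received request landed.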

rkorn over 4 years ago in reply to Edvin

I am going to try increasing the connection interval and see what happens.

When I mentioned partial erases, I was referring to this feature of the NVM controller: https://infocenter.nordicsemi.com/index.jsp?topic=%2Fps_nrf52840%2Fnvmc.html

It seems like this feature would be particularly useful for the SoftDevice. Do you know if it uses partial erase?
