This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Delay between characteristics changing and connected device getting notification of that change

Hi folks,

I have a system basically working with the following setup:

#define MIN_CONN_INTERVAL               MSEC_TO_UNITS(7.5, UNIT_1_25_MS)      
#define MAX_CONN_INTERVAL               MSEC_TO_UNITS(7.5, UNIT_1_25_MS)       
#define SLAVE_LATENCY                   0      

I have a custom service with four characteristics that allow data to get from Nordic to PC (running Linux, not Windows or OSX) with handshaking in the following way:

PC Puts data in char1

Nordic gets new data and then:

Nordic writes value to char2 which tells the PC the Nordic has got the data and has enough space in its buffer for the PC to send more data. PC then pus updated version in char1 and so on...

The other two characteristics work for data being sent the other way.

Now, this works fine apart from the fact that there's a big old delay between data being put in a characteristic and the other end getting notification of that change. One problem is I don't know whether this is a delay between the Nordic picking up what the PC has changed or the PC picking up what the Nordic has changed.

I have put measurements at the PC end so I can tell that from a characteristic changed by the Nordic to tell the PC it is ready for more data it takes 140ms between updates. ie it is a round trip of 140ms, which is far too long!

Nordic writes to char2, PC updates char 1. Between the PC picking up that change in char2 and updating char1 there is less than a millisecond.

From the Nordic picking up the change in  char1 and putting a new value in char2 to twell the PC it can update char1 there is less than a millisecond.

So, there is either:

A delay in the soft device updating the characteristic after its API is called or:

A delay in the PC picking up the change in the characteristic or:

A delay in the PC from the application writing to the characteristic and it appearing "on air" or:

A delay in the soft device from picking up the change in char1 to sending it to my software running on the Nordic.

Or some combination of these. 

Are there any settings I have not tweaked that might be causing this? If so, what migt they be?

Can anyone think of anything I can do to track down exactly where the problem might be please? If I could toggle a pin on the PC when it picks up a change in char2 that might help I guess, but the days of parallel ports are long gone! ;)

Many thanks.

Parents
  • Hi,

    A sniffer trace should reveal quite clearly what is going on over-the-air, and by that reveal where you get the delays. That should narrow it down enough that you can investigate the device and piece of code responsible for the delay.

    Regards,
    Terje

  • Hi,

    Many thanks for the reply. How does this know, though, when a characterstic change reaches various stages

    So: PC changes charactersitc. Goes out on to the air. How can this know the delay between Qt program on computer writing to the chatacteristic and that change appearing on the air?

    Then how does it know how long it has taken the on air change to hit my custom hardware with the nRF on it? In terms of the hardware detecting the change, the soft machine detecting the change and it finally bubbling through to my custom app on the nRF?

    Similarly going the other way, my app writes to the soft device; how on earth does the sniffer know when this has happened and what the delay is between that charcractersitc having been changed and it appearing on air, let alone how we time between writing to the soft device API and the change being detected on the PC?

    I must be missing something fundamental here! Thanks. :)

  • Hi,

    The sniffer can only tell when messages are sent, yes.

    But if the behavior of the devices is to respond as quickly as possible to GATT writes and/or notifications, then you can measure the time between a and b, and between b and a, in your description.

    Then you know the delay from a write is sent over the air, until the notification (with a new value) is sent back, and so you can tell how quickly the nRF does its job. E.g. you know the total time for the nRF to register the value, find the new value, and respond with a notification.

    Similarly, you will see the total time for the PC to process an incoming notification until the write has been sent over-the-air.

    A diagram of what is happening would be great for clearing up misunderstandings, yes!

    Regards,
    Terje

  • Sorry for the delay, been on other stuff.

    OK, this first one hopefully shows what I mean. The Qt app (top left) posts an update to a characteristic. The sniffer can then see when that comes on air. However, the sniffer has no chance (as far as I can tell) of knowing when the soft device posts that value to my app or, indeed, when the soft device even first becomes aware of that characteristic change.

    Similarly the other way. My app on the nRF posts a change in the return characteristic, the sniffer can see that on air, but how does it know when the laptop running the Qt app has notification?

    btmon on the laptop might well have the same kind of info as the sniffer.

    I've tried a crude timeline above too to try and make the point.

    I know that the nRF app turns around quickly from getting a notification from the soft device, to posting a change in the "handshake" characteristic annd I know at the Qt end there is a similar fast turn around once the handshake characteristic change is detected and the Qt app posts the next packet of data.

    So, am I missing something? If not, are there any common gotchas for this kind of thing?

    Thanks! Slight smile

  • Hi,

    To reiterate your notes from your first image:

    1. Laptop posts data (sniffer can timestamp)
    2. nRF receives change notification (sniffer cannot timestamp)
    3. nRF posts data (sniffer can timestamp)
    4. laptop receives change notification (sniffer cannot timestamp)

    I was under the impression that:

    • on the nRF side, 2. triggers 3., and this should happen as quickly as possible
    • on the laptop side, 4. triggers 1., and this should happen as quickly as possible
    • the round-trip, e.g. 1-2-3-4, or 3-4-1-2, is very long, but we do not know where it takes so long (e.g. is it the laptop or the nRF?)

    With sniffer timestamps we can measure the time spent on either side from rx to tx on that side.

    In the below drawing, I have drawn the transmissions (visible to the sniffer) in green, and marked various time intervals. It is a bit simplified on the laptop side, since you probably have multiple layers in the stack there, but in any case there will be a delay between over-the-air traffic and logging (no matter how "far down in the stack" the logging happens.)

    The main point is: If we rely on software running on one of the sides, we do not know what portion of time is spent "between rx and tx on the other side", and what portion of time is spent "by radio software on the local side". With the sniffer log we know for sure what the "reaction time" is for each device.

    In addition to that, useful things in the sniffer log that I have not mentioned yet:

    • if link layer packets are resent
    • in the case of writes, if there is long writes (a value written to a characteristic is so long it gets sent with several packets)
    • what connection parameters are negotiated
    • +++ (basically we get an authoritative source for everything going on over-the-air.)

    Regards,
    Terje

  • Hi Terje,

    Sorry for the delay, holidays and getting dragged off to other things has delayed this.

    I suspect I am still missing something fundamental here. Sorry if I'm repeating myself and not explaining myself well here, but I just don't understand how this helps in seeing what goes on in the software on our custom hardware that has the Nordic device on it.

    How does the sniffer get into our custom hardware/software to see where these delays are happening? Do we have to have some sort of UART connection from our custom hardware to the sniffer? If not. the sniffer, surely, can simply see what's happening on the airwaves and not if our Nordic code is resending or putting in delays?

    What am I not understanding here?

    Thanks again and sorry if I'm being slow.

  • Hi,

    It is in order to see which side is responsible for the delays. Further debugging is next step.

    Regards,
    Terje

Reply Children
  • Hi Terje,

    I spent a couple of days installing the sniffer software (what a job that is!) and learning how to use Wireshark in this context, which is not easy either! :)

    I have used Wireshark on Ethernet several times and find it reasonably intuitive. However, I am not understanding this at all at the moment.

    So, if I post a characteristic on the laptop of 240 bytes, I can see that Wireshark is saying that is broken down into 17 (seventeen) discrete packets

    I have attached a pcap file. The first four packets seem right and are NOT part of the 240 byte characteristic. The first two constitute an arrow down command from the computer to the nrf and a response from the nrf to the computer to say got it, I'm ready for the next message,

    The next two packets are erase flash and get ready for a new firmware image, plus the ack to say instruction received. Packet 3 from computer to nrf, packet four from nrf to computer.

    The next seventeen is supposed to be one 240 byte characteristic with the first  chunk of a large binary file.

    I set a breakpoint in the computer's Qt code that is sending the new firmware image from the computer to the nrf at the point the computer gets the ACK from the nrf to say, please send me the next chunk,

    There are a whole host of questions with this:

    1. Why is the 240 byte characteristic not being seen as one entry in the pcap? It looks like it is being broken down into multiple packets. Is there a problem with how I'm sending out packets or is this an artefact/limitation of the sniffer? It looks to me like everything is coming in to Wireshark in 27 byte chunks or fewer. What is REALLY odd is that packet 5, although the chunk has 27 bytes, the first seven are not what I am sending out. This is, I suppose, likely to be somethign about LE I don't understand.

    The last four bytes of the 240 byte characteristic that is written, is  in packet  number 14 of length 30 - hex values of 21, 1d, 01, 08 whereas the chunks 5-13 are 20-27 byte

    2. Is the sniffer only capable of receiving 27 byte characteristics? This on the one hand seems likely given the 27 byte chinks of data in packets 5-13. On the other hand, that seems unlikely as if the sniffer could only handle 27 byte characteristics it would freak out  with a 240 byte example and just do nothing. I wondered whether the sniffer could only send certain sizes of data to Wireshark, but that's clearly not the case as packets 15-22 show.

    3. And then we come to 15-22. What are they? Why are they so much bigger, why are there so many CRC errors?

    4. Why am I not - with the sniffer - seeing a reply from the nrf to the computer to say ACK, send me the next packet? When I send single frame messages I see that ACK.

    I have attached a filtered pcap of what I am talking about here.

    TBH I am not clear how this really helps me still, but that's a question for when we've sussed ths out. Maybe this whole thing of characteristics not really being 240 bytes long is the crux of this.

    Many thanks for your help and patience, it is greatly appreciated.

    First Chunk Filtered.pcapng

  • Hi,

    Nice, thank you for the sniffer log. I think it points us neatly towards what is the issue.

    1. The L2CAP fragmentation that you see is not a limitation of the sniffer. What we see is how the packets are actually being sent.

    2. Yes, the existence of longer packets shows that the sniffer is capable of receiving longer packets.

    3. Good questions. Is there any change of MTU size, or PDU, or other activity in-between there that can explain a change?

    4. There should be communication both ways, in order to ACK from the Slave, as you mention.

    Indeed, I think looking into the fragmentation will lead us towards a solution here...

    We have seen similar behavior before, in that some packets are split in two that one would expect not to have been fragmented at all. The reason has to do with L2CAP and Link Layer frames being stored in ring buffers sometimes wrap around and thus lie partly at the end and partly at the beginning of the ring buffer. Because of contiguous memory requirements this splits them into two, for all intents and purposes.

    While the previously seen issue may explain part of the fragmentation (the "final" division into two fragments), it does not explain why the MTU gets split into more than two fragments. For looking into that, we need more of the sniffer log, to see what MTU, PDU, etc. are negotiated.

    For the case related to the previously seen issue the data transfer went from Slave to Master and there were two-way communication in the Sniffer logs. If I am not mistaken, there should be two-way communication in your case as well, at least at some point, for the Slave to ACK. Is there any other BLE activity from Slave to Master in-between the filtered sniffer output?

    Full sniffer log would be really helpful. If you do not want to share it openly here in a public forum thread, let me know and I can change this into a private ticket.

    Regards,
    Terje

  • Hi again,

    It's taken a fair bit of time this morning to get this going. The sniffer and/or Wireshark is very flaky. I have to log out and back in again if I want to start a new sniff and the nrf dev board needs to be powered off and on.

    Anyway, here is a full log, but it's even weirder than before. The first transfer starts at packet 10, this is 27 bytes long. We then get a reply from the nrf which is bizarre. Wireshark thinks it's empty, but it's not, it has two bytes in its payload which is correct.

    We then get a whole massive series of 27 byte transfers. The file being transferred starts on packet 12 with FC 9F and ends with rhe 1E 01. The next bytes in the file being sent are 08 1A D5 etc exactly as is found in packet 14.

    The last part of the file is in packet 26592 ending in 10 C6 BA.

    Everything is clearly getting sent correctly as the new file is accepted, reprogrammed in the target and rebooted.

    I'm going to put the debugger on the nrf and double check what size packets it is reading. It certainly *used* to be 240 bytes, so I con't see why it should have changed.

    As far as I underatand, the master (Qt program) listens to the slave (nrf) and decides size of transfer on the size of the characteristic, so the Qt program shouldn't, AFAIK, decide to on;y sent 27 byte chunks.

    Will be back with a little more info on what the nrf receives shortly!

    small packet sizes.pcapng

  • I can confirm that when I get a read of the 240 byte characteristic, my printf in SES tells me that I receive 240 characters each and every time.

  • One further thing to report. If I use a different laptop which has a BT4.2 module it works really well, really fast data transfer as I'd expect. Do you know of any reason why there should be an issue with 4.2 and not 5.0?

Related