BLE device local name parsing

I have a BLE device whose local name contains an apostrophe. When I run nRF Connect on an Android phone and view the raw data for this device, I find the apostrophe is represented by the following three hex bytes: 0xe2 0x80 0x99. I am developing an app that reads the local name, and I find the same three bytes for this device at the location of the apostrophe in the scan report data. I am curious how you came to interpret these three bytes as an apostrophe, when that is usually represented by the single ASCII byte 0x27.
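
For readers following along, locating the local name bytes in scan report data can be sketched roughly as below. This assumes the usual advertising data layout of repeated length / AD type / data structures, with AD types 0x08 (Shortened Local Name) and 0x09 (Complete Local Name). The function name `find_local_name` and the sample payload are made up for illustration and are not taken from the asker's app.

```python
from typing import Optional

# AD types that carry the local name (shortened and complete)
LOCAL_NAME_TYPES = (0x08, 0x09)

def find_local_name(adv: bytes) -> Optional[bytes]:
    """Return the raw local-name bytes from an advertising payload, or None."""
    i = 0
    while i < len(adv):
        length = adv[i]                  # counts the AD type byte plus the data bytes
        if length == 0:                  # zero length: padding, nothing more to read
            break
        if i + 1 + length > len(adv):    # structure runs past the buffer: malformed
            break
        ad_type = adv[i + 1]
        data = adv[i + 2 : i + 1 + length]
        if ad_type in LOCAL_NAME_TYPES:
            return data
        i += 1 + length                  # skip to the next AD structure
    return None

# Hypothetical payload: a Flags structure followed by a Complete Local Name
name = b"Ann\xe2\x80\x99s Tag"
adv = bytes([0x02, 0x01, 0x06]) + bytes([len(name) + 1, 0x09]) + name

raw = find_local_name(adv)
print(raw.hex(" "))  # 41 6e 6e e2 80 99 73 20 54 61 67
```

The sketch deliberately returns raw bytes and leaves the choice of text decoding to the caller.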

  • Hello,

    That sounds weird, and would be something I too would be curious about.
    Could you please provide me a more exact description of the issue - with a screenshot of the "raw data for this device" and the scan report data, along with the device name in question?
    Initially I would think that this is the result of the inputted apostrophe not actually being a regular apostrophe, but rather a very similar looking non-ASCII character. Could this be possible?

    I am looking forward to getting to the bottom of this,

    Best regards,
    Karl

  • As you can see from the nRF Connect images above, there is a device "Maggie's Room" for which the Raw Data in the first line has the hex bytes "0xe2 0x80 0x99" in between the 'e' (0x65) and the 's' (0x73), where the apostrophe is. The hex value for an ASCII apostrophe is 0x27 and is only one byte. In my code I retrieve the same raw data, but the apostrophe gets displayed as three extended ASCII characters that look like garbage in the output. So my question is: how do those three bytes get interpreted and properly displayed as an apostrophe in your nRF Connect code?

  • Hello,

    I am unable to reproduce this on my end. When I input a ' it is written as 0x27 in the raw data, as can be seen in the attached screenshot. I tested this with nRF Connect as well, with the same result. Therefore, I strongly suspect that the inputted character is in fact not an apostrophe.


    Upon further investigation, your inputted character - 0xE2 0x80 0x99 - is in fact the UTF-8 encoding of the right single quotation mark (’, U+2019).
    This explains why it is rendered so similarly to an apostrophe, even though it is a different character.
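
To make the difference concrete, here is a minimal sketch of the same five bytes decoded two ways. It assumes the "garbage" in the asker's output comes from mapping each byte to one character in a single-byte code page; Windows-1252 is used here purely as an example, since the thread does not say which one was involved.

```python
# 'e', the three-byte sequence from the raw data, then 's'
raw = bytes([0x65, 0xE2, 0x80, 0x99, 0x73])

as_utf8 = raw.decode("utf-8")      # multi-byte sequences become one character
as_cp1252 = raw.decode("cp1252")   # assumed code page: every byte becomes one character

print(as_utf8)                     # e’s
print(as_cp1252)                   # eâ€™s
print(f"U+{ord(as_utf8[1]):04X}")  # U+2019
```

Decoding the name bytes as UTF-8 is what turns the three bytes back into a single quotation mark.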

    If you run into similar problems in the future, I recommend looking the bytes up in a Unicode lookup table for the fastest possible resolution.
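
The lookup can also be done programmatically. The following is a small sketch using Python's standard `unicodedata` module; the helper name `describe_non_ascii` is hypothetical, and the name bytes are reconstructed from the device name quoted earlier in the thread.

```python
import unicodedata
from typing import List

def describe_non_ascii(text: str) -> List[str]:
    """List the code point and Unicode name of every non-ASCII character."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
        if ord(ch) > 0x7F
    ]

# Name bytes as they would appear in the raw data, decoded as UTF-8
name = b"Maggie\xe2\x80\x99s Room".decode("utf-8")

for line in describe_non_ascii(name):
    print(line)  # U+2019 RIGHT SINGLE QUOTATION MARK
```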

    Let me know if you should have any other questions! :)

    Best regards,
    Karl

  • Thanks so much for your reply. I never considered that the advertised name might be encoded as something other than ASCII. Do you know if UTF-8 is commonly used in Bluetooth string data encoding?

  • No problem at all, I am happy to help!

    The general case is pure ASCII; it is only when the input contains non-ASCII characters that multi-byte UTF-8 sequences appear. ASCII is a subset of UTF-8, so an ASCII-only name is encoded identically either way, with one byte per character.
    In your case, you can avoid this by sticking to only ASCII characters, replacing the right single quotation mark with the ASCII apostrophe ' (0x27).
    Perhaps your keyboard is configured to a national layout that contains more non-ascii characters? That could be something to look into.
    I would recommend sticking to only ascii characters if possible for minimal byte usage and simplicity.
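
If the name is set from a script or tool you control, the replacement suggested above could be sketched as follows. The mapping table and the function name `to_ascii_name` are illustrative assumptions, not part of any Nordic tooling.

```python
# Hypothetical mapping of common typographic quotes to their ASCII counterparts
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201C": '"',   # left double quotation mark
    "\u201D": '"',   # right double quotation mark
})

def to_ascii_name(name: str) -> str:
    """Replace typographic quotes and reject any remaining non-ASCII characters."""
    cleaned = name.translate(ASCII_QUOTES)
    if not cleaned.isascii():
        raise ValueError(f"name still contains non-ASCII characters: {cleaned!r}")
    return cleaned

original = "Maggie\u2019s Room"
fixed = to_ascii_name(original)

print(fixed)                                                        # Maggie's Room
print(len(original.encode("utf-8")), len(fixed.encode("utf-8")))    # 15 13
```

For this name the ASCII version saves two bytes in the advertising payload.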

    Best regards,
    Karl
