
NRF_LOG_HEXDUMP_* raises Data Access Violation Fault if a large buffer is used

Hi to all,

I have found that, using SDK 17.0.0 on the nRF52832, whenever I try to use NRF_LOG_HEXDUMP_DEBUG for a large buffer (more than 1000 elements), I get a Data Access Violation Fault.

I understand that since NRF_LOG is deferred and my buffer is allocated on the stack, the buffer pointer can end up referring to corrupted data by the time the log is processed.
Aware of that, after calling NRF_LOG_HEXDUMP_DEBUG I call NRF_LOG_FINAL_FLUSH, in order to fully flush the logs.
I have also changed NRF_LOG_BUFSIZE to 4096, so that my buffer (1800 bytes) can be stored in full.
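
For reference, the log-related entries in my sdk_config.h look roughly like this (a sketch of my configuration, not a verbatim copy):

// sdk_config.h (excerpt)
#define NRF_LOG_ENABLED       1     // logging module enabled
#define NRF_LOG_DEFERRED      1     // deferred mode: log calls only queue the data
#define NRF_LOG_BUFSIZE       4096  // increased so the 1800-byte buffer should fit
#define NRF_LOG_DEFAULT_LEVEL 4     // Debug, so NRF_LOG_HEXDUMP_DEBUG is compiled in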

I have also noticed that the output of my calls to NRF_LOG_HEXDUMP_DEBUG is capped at 80 bytes (8 columns x 10 rows), even if the buffer contains more data and the length argument of the NRF_LOG_HEXDUMP_DEBUG call is larger than 80 bytes.

Below is a code snippet. I have tried to dump the buffer immediately after creation (even though it is all zeroes), just to rule out errors in the functions that manipulate the buffer contents. I still get a Data Access Violation Fault.

uint8_t cert_buffer[1800] = {0};          // stack-allocated, all zeroes
NRF_LOG_HEXDUMP_INFO(cert_buffer, 1800);  // this call triggers the fault
NRF_LOG_FINAL_FLUSH();                    // flush the deferred logs

Am I missing something or is this a bug? Can anyone reproduce this behavior?

Thanks in advance,

Pedro

Parents
  • Hi

    Each byte of RAW data is converted to an ASCII hex representation, so essentially each byte of raw data becomes multiple bytes of log data.

    Does it work if you try to send a shorter dump, such as 500 bytes?

    Best regards
    Torbjørn

  • Hi ,

    I took the hex-to-ASCII conversion into consideration. Since I couldn't pinpoint whether that conversion is done when storing or when printing the data, I made the NRF_LOG buffer bigger than required. Converting from hex to ASCII doubles the number of bytes required per character, so 2*1800 < 4096, which means the full buffer should fit inside the NRF_LOG buffer.

    Note that the maximum amount of data printed is always capped at 80 bytes of hexadecimal data.

    Does it work if you try to send a shorter dump, such as 500 bytes?


    It doesn't raise a Data Access Violation Fault, but the output is still capped at 80 hex bytes.

  • Thank you for the quick response.

    Please ping me when you have news.

    Best regards,

    Pedro

  • Hi Pedro

    One of our developers did some experiments, trying to increase the NRF_LOG_MSGPOOL_ELEMENT_COUNT parameter in sdk_config.h.

    First he tested with 40, which seemed to work fine, but increasing it much beyond this did not give stable results.

    In order to print 1800 bytes you would need to set this parameter to 90, but this doesn't seem to work properly; there is still some other limit somewhere preventing all the bytes from being written.
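
    For reference, the parameter I mean is this entry in sdk_config.h. With the default NRF_LOG_MSGPOOL_ELEMENT_SIZE of 20 bytes per element, 90 elements cover 90 * 20 = 1800 bytes (values below are just examples):

    // sdk_config.h (excerpt) - example values only
    #define NRF_LOG_MSGPOOL_ELEMENT_SIZE  20   // bytes of hexdump payload per pool element
    #define NRF_LOG_MSGPOOL_ELEMENT_COUNT 90   // 90 * 20 = 1800 bytes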

    Could you try changing this value as well and see how it affects the output?

    Best regards
    Torbjørn

  • Hi @ovrebekk,

    Some updates from my side.
    I have created a simple project which uses the sdk_config.h I shared. It has a main loop that only dumps the buffer. The buffer is still declared inside main, so it is allocated on the stack, as in the previous tests.

    Could you try changing this value as well and see how it affects the output?

    I have tried different values of NRF_LOG_MSGPOOL_ELEMENT_COUNT. Below is a summary:

    NRF_LOG_MSGPOOL_ELEMENT_COUNT | Bytes printed | Error received
    ------------------------------|---------------|-----------------------------------------------
                                8 |            80 | Non-Maskable Interrupt
                               16 |           104 | Non-Maskable Interrupt
                               32 |           208 | Data Access Violation escalates to Hard Fault
                               40 |           280 | Data Access Violation escalates to Hard Fault
                               64 |           384 | Data Access Violation escalates to Hard Fault
                               90 |           512 | Data Access Violation escalates to Hard Fault
                              128 |           464 | Data Access Violation escalates to Hard Fault

    A screenshot of the call stack in one of the cases:

    Also, during my tests, I noticed that the Data Access Violation occurs because the flushing of the logs somehow attempts to access impossible addresses, such as 0x616E696D (the value seen when NRF_LOG_MSGPOOL_ELEMENT_COUNT = 128). That value happens to be four printable ASCII characters, which makes me suspect log data is being interpreted as a pointer.



    In order to print 1800 bytes you would need to set this parameter to 90, but this doesn't seem to work properly; there is still some other limit somewhere preventing all the bytes from being written

    I couldn't reproduce that. My tests always end in some error condition that prevents the buffer from being printed.

    Can you please analyze the data above? I think I have narrowed it down to the flushing of the logs, but I am not sure yet whether the problem resides in the NRF_LOG module or in the backend module that NRF_LOG uses to flush the logs...

  • Hi 

    The developer found a workaround by writing the data in smaller chunks, and flushing the buffers in between. 

    He used the following configuration:

    NRF_LOG_MSGPOOL_ELEMENT_SIZE 20
    NRF_LOG_MSGPOOL_ELEMENT_COUNT 110
    NRF_LOG_BUFSIZE 4096

    and then he wrote the buffer like this:

    uint8_t cert_buffer[1800] = {0};
    for (int i = 0; i < 1800; i += 600)
    {
        // Dump 600 bytes at a time and flush in between so the log
        // buffers never have to hold the whole 1800-byte buffer at once.
        NRF_LOG_HEXDUMP_INFO(cert_buffer + i, 600);
        NRF_LOG_FLUSH();
    }
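
    If you want to avoid hard-coding the chunk size, the same idea can be wrapped in a small helper. This is just a sketch along the same lines, not something taken from the SDK:

    // Hypothetical helper: dump a buffer of arbitrary length in fixed-size
    // chunks, flushing the deferred log buffers between chunks.
    static void hexdump_chunked(const uint8_t * p_data, uint32_t length, uint32_t chunk_size)
    {
        for (uint32_t offset = 0; offset < length; offset += chunk_size)
        {
            uint32_t remaining  = length - offset;
            uint32_t this_chunk = (remaining < chunk_size) ? remaining : chunk_size;

            NRF_LOG_HEXDUMP_INFO(p_data + offset, this_chunk);
            NRF_LOG_FLUSH();
        }
    }

    // e.g. hexdump_chunked(cert_buffer, sizeof(cert_buffer), 600);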

    Can you give this a go and see if it works better?

    It's also important to note that the NRF_LOG_FINAL_FLUSH() function should not be used in normal operation; it is only meant to be used inside error handlers (running in interrupt context). Please use NRF_LOG_FLUSH() instead.
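
    To make the distinction concrete, the intended split looks roughly like this (a minimal sketch):

    // Thread context / normal operation: queue the data, then flush explicitly
    NRF_LOG_HEXDUMP_INFO(cert_buffer, 600);
    NRF_LOG_FLUSH();

    // With deferred logging, the idle loop typically drains the queue instead:
    //     while (NRF_LOG_PROCESS());
    //
    // NRF_LOG_FINAL_FLUSH() is reserved for fault/error handlers running in
    // interrupt context, right before the device halts or resets.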

    Best regards
    Torbjørn

  • Hi ,

    Can you give this a go and see if it works better?

    It works better. I am able to print 135 lines of 8 bytes each, so I get 1080 of the 1800 bytes.


    Please note one thing: I am aware that I can create workarounds to fully print the entire certificate (1800 bytes), by printing it in steps or by dumping the variable contents to a file through a debugger.


    On this issue I would prefer the discussion not to focus on workarounds. Instead, I would like to address these two things:
    1. Why does NRF_LOG_HEXDUMP_* print fewer bytes than requested without outputting any error message?
    2. Why, when an error does occur, is it a hardly recoverable Data Access Violation? Are there no verification steps inside NRF_LOG_HEXDUMP_*?


    To my knowledge, there are also no documented size limitations for NRF_LOG_HEXDUMP_*.

    Can you please address the two topics above? If you need more debug data/help from my side, please advise.


Children
  • Hi

    I agree it would be nice to know the reason for these issues, and what exactly sets the limits. 

    I tried to do some testing on my own and found I was able to print up to 800 bytes in one chunk, but 1024 bytes or more would fail. 

    I need some more time to look properly into this though. I need some assistance from the original developer of the logging libraries, and he is not available until next week. 

    Best regards
    Torbjørn

  • Hi ,

    Sure, take your time to fully analyse this issue.
    As you pointed out above, there are workarounds that work just fine, which I am using in the meantime.

    Once you have any news and/or require inputs from my side, just let me know. Happy to help.


    Best regards,
    Pedro

  • Thanks for the understanding, Pedro, you will hear from me early next week then :)

  • Hi Pedro

    The developer looked into this some more, and found the issue. 

    Essentially the 32-bit header added to all hexdump messages is implemented as follows:

    typedef struct
    {
        uint32_t type       : 2;
        uint32_t in_progress: 1;
        uint32_t severity   : 3;
        uint32_t offset     : 10;
        uint32_t reserved   : 6;
        uint32_t len        : 10;
    } nrf_log_hexdump_header_t;
    

    The critical fields are offset and len, which are 10-bit values, effectively limiting a HEX dump to 1023 bytes in length. 

    In order to increase the limit to 4095 bytes you can change the struct to this:

    typedef struct
    {
        uint32_t type       : 2;
        uint32_t in_progress: 1;
        uint32_t severity   : 3;
        uint32_t offset     : 12;
        uint32_t reserved   : 2;
        uint32_t len        : 12;
    } nrf_log_hexdump_header_t;
    

    I have asked the team if we can document this limitation, and add a length check on the HEXDUMP API, but I can't guarantee if or when this will be implemented. 
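
    In the meantime, a guard at the call site can at least make the limit visible. This is a hypothetical wrapper, not part of the SDK:

    // Hypothetical wrapper: skip dumps larger than what the 10-bit len/offset
    // fields in nrf_log_hexdump_header_t can represent, and log a warning instead.
    #define HEXDUMP_MAX_LEN 1023

    #define HEXDUMP_INFO_CHECKED(p_data, len)                                \
        do                                                                   \
        {                                                                    \
            if ((len) > HEXDUMP_MAX_LEN)                                     \
            {                                                                \
                NRF_LOG_WARNING("hexdump of %d bytes exceeds limit", (len)); \
            }                                                                \
            else                                                             \
            {                                                                \
                NRF_LOG_HEXDUMP_INFO((p_data), (len));                       \
            }                                                                \
        } while (0)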

    For now I hope the workaround above will suffice?

    Best regards
    Torbjørn

  • Hi,

    Thank you for your and the developer's time. The NRF_LOG_HEXDUMP_* limitations make total sense, and I can confirm this solution works.
    I would appreciate it if you could add the outcomes of this thread to the documentation as well; it could help other developers.

    Before marking this as solved, can you and the developer confirm that "shaving off" 4 bits of the reserved field has no implications?

    I searched through the documentation and the SDK files and could not find any usage of the reserved field. Is this field's purpose only to pad the 32-bit nrf_log_hexdump_header_t bit field?

    I am specifically concerned about two things:

    1. Bitmasks not addressing the correct bits and silently doing things they are not supposed to do
    2. The reserved field not working as expected

    Can you please confirm it's ok to change the length of the reserved field?

    Best regards,
    Pedro
