Runtime hangs at an unknown location when an array is increased to a certain size

Using nRF52840 DK with SoftDevice s140 latest version.

I have a data array that I am sending over a characteristic in fragments via notifications. Given the peer platform, each fragment has to fit in the minimum MTU, so it is 20 bytes. The bulk of the data array is a waveform. When the waveform contains 800 bytes (plus about 48 bytes of overhead), the data array is successfully fragmented and sent. When I increase the waveform to 1000 bytes, the calloc call for the data does not fail, so I assume there is enough memory in the heap to handle it. However, when it comes time to send it, the application just hangs. I don't know where and don't understand why. The only thing I can think of is that there is some overwrite, or that there is a memory shortage and calloc is not telling me.

Note that I am not using the SDK but just SoftDevice and I am handling the events in the main() for loop.

In debug mode I do get a "SOFTDEVICE: ASSERTION FAILED" from the NRF_FAULT_ID_SD_ASSERT case of a switch statement, but I do not understand the meaning of it. Documentation is, as usual, useless. There is no text at all on what the possible causes of these assertions are.

Reply
  • Hi 

    How exactly are you fragmenting and sending the message?

    Are you basically just looping over the entire array and uploading 20 bytes at a time to the stack?

    How are you handling the NRF_ERROR_RESOURCES error when the stack buffers eventually fill up?

    Have you tried to use a static 1000 byte array, just to verify 100% that dynamic memory allocation has nothing to do with it?

    When the SoftDevice asserts there should be a program counter included, providing a pointer to exactly which part of the SoftDevice caused the assert. I can check this number with the team to figure out what exactly went wrong. 

    As a side question, any reason why you are not using data length extension to allow larger Bluetooth packets?
    This should make the data transfer of larger chunks much more efficient. 

    Best regards
    Torbjørn
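
A SoftDevice fault handler is called with a fault id, a program counter and an info word, so the program counter asked for above can be captured in code instead of read from the register view. The following is a minimal host-side sketch of that idea. The names `fault_record_t`, `record_fault`, `fault_is_sd_assert` and the constant `FAULT_ID_SD_ASSERT_STANDIN` are hypothetical, and the stand-in value is an assumption; on the target the real NRF_FAULT_ID_SD_ASSERT from the SoftDevice headers would be used.

```c
#include <stdint.h>

/* ASSUMPTION: stand-in for NRF_FAULT_ID_SD_ASSERT. On the target, use the
 * constant from the SoftDevice headers instead of this placeholder. */
#define FAULT_ID_SD_ASSERT_STANDIN 1u

/* Hypothetical record of the last fault, kept in RAM so it can be
 * inspected from a debugger after the handler has run. */
typedef struct {
    volatile uint32_t id;    /* which kind of fault was reported */
    volatile uint32_t pc;    /* program counter at the point of the fault */
    volatile uint32_t info;  /* extra, fault-specific information */
    volatile uint32_t valid; /* non-zero once a fault has been recorded */
} fault_record_t;

static fault_record_t g_last_fault;

/* Same three-argument shape as a SoftDevice fault handler.
 * It only stores the values; a real handler would then halt or reset. */
static void record_fault(uint32_t id, uint32_t pc, uint32_t info)
{
    g_last_fault.id    = id;
    g_last_fault.pc    = pc;
    g_last_fault.info  = info;
    g_last_fault.valid = 1u;
}

/* Returns non-zero if the recorded fault is a SoftDevice assert. */
static int fault_is_sd_assert(const fault_record_t *f)
{
    return f->valid != 0u && f->id == FAULT_ID_SD_ASSERT_STANDIN;
}
```

With something like this in place, the `pc` field holds the value to report when the assert fires, rather than whatever the program counter happens to be once execution has already moved into the error handler.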

Children
  • Thanks for your quick response.

    Fragmenting. When doing notifications I notify a hunk and check the response. If not out of tx buffers I check to see if I have more to send. If so, I notify the next hunk. When I am out of tx buffers, I break the loop and enter the main for(;;) loop. There I have the sd_ble_evt_get() event handler code where I dispatch received events to my BLE event handler. The bottom line is that I wait for the BLE_GATTS_EVT_HVN_TX_COMPLETE event so I can continue sending. There are, of course, many fragments when the max packet is 20 bytes, but it works for data arrays of at least 948 bytes. (The problem happens when the array is 1048, but I have not found the 'break' point.)

    I have not tried a static array as that would be quite difficult given the design. I am basically implementing a proposed generic health device BTLE standard where one standard will handle all health devices. So the code is designed to be configurable based upon what measurements you wish to have and the value type (numeric, compound, coded, bits, sample array). The data arrays are generated and indices calculated so one can call simple methods to update a value from a sensor. My failing case is a spirometer when the number of samples (2 bytes per sample) of flow data is 500. At 450 it works. The arrays are all allocated and freed dynamically. I have tested that part of the code in Visual Studio since it does not rely on any Bluetooth; it's all done in raw C that will work on any platform. I could test with a static array, but I will do that only as a last resort.

    I have made the arrays big enough that the memory allocation fails when running on the DK, so I am assuming that all my callocs and frees are functioning as they should on the DK. (In Visual Studio I can do memory leak detection, and I have at least got that taken care of there; the code is the same in both the Visual Studio and Nordic SES projects.)

    I do take advantage of the extension, but the peer has to support it. The Android I am using is kind of old (OS 6) and does not support that feature. I do have a newer Android (OS 11) that does support it and it might be interesting to try it there and see if I have the same problem. 

    I can run it again in debug mode and check the program counter in the register view. I have no source code for the SoftDevice, so I cannot check which code is generating that assertion.

    I should add that I am not an embedded programming expert by any means ... I am diving into this just to try and validate the standard design, so it would not surprise me if I am doing something stupid. Just figuring out how to do the fragmentation and handling of the events via SoftDevice was a big challenge for a noob like me.

    The goal is to make the standard simple to implement, and the more one can hide in simple library methods, the more likely it will be adopted. We gateway designers are sick of having to write new code for every device type. This would solve that problem. I have one Android central gateway which I wrote once and it works for every device I dream up with no code change.

    I have nothing against sending you the project if that will help.

    I did the debug thing and at the hang the program counter was 

    0x00000a60

    At the point where I get the assertion displayed the pc is 0x00028f8c but that is likely not relevant as I am no longer in SoftDevice code but in the error handler.
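
The fragmentation scheme described in the post above (queue 20-byte fragments until the stack runs out of TX buffers, then resume on the TX-complete event) can be sketched on the host as follows. This is not the poster's code. `fragment_sender_t`, `sender_start`, `sender_pump`, `send_fn_t`, the status codes and the fake transport are all hypothetical names; on the target, the send hook would presumably wrap `sd_ble_gatts_hvx()` and map NRF_ERROR_RESOURCES to `SEND_NO_BUFFERS`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAGMENT_SIZE 20u /* payload per notification at the minimum MTU */

/* Hypothetical status codes standing in for the stack's return values. */
typedef enum { SEND_OK = 0, SEND_NO_BUFFERS = 1 } send_status_t;

/* Hypothetical transport hook: queues one fragment or reports a full queue. */
typedef send_status_t (*send_fn_t)(const uint8_t *chunk, size_t len, void *ctx);

typedef struct {
    const uint8_t *data;   /* whole message */
    size_t         length; /* total number of bytes */
    size_t         offset; /* bytes accepted by the transport so far */
} fragment_sender_t;

static void sender_start(fragment_sender_t *s, const uint8_t *data, size_t length)
{
    assert(s != NULL && (data != NULL || length == 0));
    s->data = data;
    s->length = length;
    s->offset = 0;
}

/* Queue fragments until the message is done or the transport is full.
 * Returns 1 when everything has been queued, 0 when the caller must wait
 * for a TX-complete event and call sender_pump() again. */
static int sender_pump(fragment_sender_t *s, send_fn_t send, void *ctx)
{
    while (s->offset < s->length) {
        size_t remaining = s->length - s->offset;
        size_t len = remaining < FRAGMENT_SIZE ? remaining : FRAGMENT_SIZE;
        if (send(s->data + s->offset, len, ctx) != SEND_OK) {
            return 0; /* offset unchanged: the same fragment is retried later */
        }
        s->offset += len;
    }
    return 1;
}

/* Fake transport for host-side testing: accepts `credits` fragments,
 * then reports that no buffers are left. */
typedef struct { unsigned credits; unsigned sent; size_t bytes; } fake_tx_t;

static send_status_t fake_send(const uint8_t *chunk, size_t len, void *ctx)
{
    fake_tx_t *tx = (fake_tx_t *)ctx;
    (void)chunk;
    if (tx->credits == 0) {
        return SEND_NO_BUFFERS;
    }
    tx->credits--;
    tx->sent++;
    tx->bytes += len;
    return SEND_OK;
}
```

In this sketch, the handler for BLE_GATTS_EVT_HVN_TX_COMPLETE would simply call `sender_pump()` again. Because the offset only advances when a fragment is accepted, a rejected fragment is neither lost nor sent twice.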

  • It looks like it's an internal log function. I got this error when running with Android 11:

    <error> app: ERROR 3735928559 [Unknown error code] at E:\projects\utech\nRF5_SDK_17.0.2_d674dde\components\libraries\log\src\nrf_log_frontend.c:388
    PC at: 0x0002FD27
    <error> app: End of error report

    happening here

        switch (header.base.generic.type)
        {
            case HEADER_TYPE_HEXDUMP:
                dropped = header.dropped;
                rd_idx += CEIL_DIV(header.base.hexdump.len, sizeof(uint32_t));
                break;
            case HEADER_TYPE_STD:
                dropped = header.dropped;
                rd_idx += header.base.std.nargs;
                break;
            default:
                ASSERT(false);
                break;
        }

    I turned off logging and the app worked with the 500 sample size, on both Android OS 6 and OS 11.

    Now what? No logging?

    I found the offending log call:

    NRF_LOG_RAW_HEXDUMP_INFO(global_send.data, global_send.data_length);

    I guess the array is too big.

  • Hi 

    That explains it. I recently had a similar case, where someone was struggling to get longer hexdumps with the LOG API:
    https://devzone.nordicsemi.com/f/nordic-q-a/66041/nrf_log_hexdump_-raises-data-access-violation-fault-if-a-large-buffer-is-used

    Feel free to give it a read, but the TL;DR version is that out of the box the NRF_LOG_RAW_HEXDUMP_ functions can handle buffers up to 1023 bytes long, if you set the configuration correctly in sdk_config.h.

    In order to be able to write buffers up to 4095 bytes in length you can follow the instructions in that case.

    By default you can only write up to 160 byte buffers, but by increasing NRF_LOG_MSGPOOL_ELEMENT_COUNT in sdk_config.h you can increase the max length of the hexdumps, up to the absolute limits mentioned above. 

    The total limit is given by NRF_LOG_MSGPOOL_ELEMENT_COUNT * NRF_LOG_MSGPOOL_ELEMENT_SIZE. 

    Best regards
    Torbjørn
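
The limits quoted in the post above can be turned into a small check. This is only a sketch of the stated arithmetic (pool size = element count × element size, capped at the out-of-the-box ceiling of 1023 bytes); it does not model the 4095-byte extension from the linked case. `max_hexdump_len` and `hexdump_fits` are hypothetical helper names, and the 8 × 20 values used in the usage note are an assumption chosen to match the 160-byte default mentioned above, so check your own sdk_config.h.

```c
#include <stdint.h>

/* Out-of-the-box ceiling for a single hexdump, as quoted in the thread. */
#define HEXDUMP_ABSOLUTE_MAX 1023u

/* Largest hexdump the message pool can hold, given the two pool settings
 * (NRF_LOG_MSGPOOL_ELEMENT_COUNT and NRF_LOG_MSGPOOL_ELEMENT_SIZE). */
static uint32_t max_hexdump_len(uint32_t element_count, uint32_t element_size)
{
    /* 64-bit product so that large settings cannot overflow. */
    uint64_t pool = (uint64_t)element_count * (uint64_t)element_size;
    return pool < HEXDUMP_ABSOLUTE_MAX ? (uint32_t)pool : HEXDUMP_ABSOLUTE_MAX;
}

/* Returns non-zero if a buffer of `length` bytes fits in one hexdump call. */
static int hexdump_fits(uint32_t length, uint32_t element_count, uint32_t element_size)
{
    return length <= max_hexdump_len(element_count, element_size);
}
```

For example, `max_hexdump_len(8, 20)` gives 160, and `hexdump_fits(1048, 64, 20)` is false: under these limits the 1048-byte array from this thread exceeds the 1023-byte ceiling no matter how large the pool is made.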

  • In all honesty it was the absolute last place I was looking for a possible error. I got lucky yesterday and tried it with a different client and got a debug message which pointed to the location of the error, and it was in the log C files. Total surprise. I was looking for measurement queues getting out of sync, incorrect handling of mutexes, code events trampling on one another because it was taking too long to send, etc. Comment out that dump and the code worked. Been messing with that for a week!

    Now I just loop over hunks, format the hex string myself into pieces, and use NRF_LOG to print out the bytes. Works fine and I see all the bytes. One big problem down, one big one to go: why do I have to press the DK reset button to start my program? Uploads of the code (or power-ups) always give a SoftDevice assertion error, INVALID STATE. After pressing the reset button the program works. Something is bad somewhere.
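
The workaround described in the post above (format the bytes into short hex strings and log them piece by piece) might look roughly like this. It is a host-side sketch, not the poster's code. `format_hex_line`, `hexdump_chunked`, `line_fn_t` and `emit_stdout` are hypothetical names, and the 16 bytes per line is an arbitrary choice. On the target, the emit callback would wrap the logging call; with deferred logging, a string that lives on the stack may need to be copied before it is processed, so consult the logger documentation for that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BYTES_PER_LINE 16u
/* Two hex digits plus a separator per byte, plus the terminating NUL. */
#define LINE_BUF_SIZE  (BYTES_PER_LINE * 3u + 1u)

/* Hypothetical output hook; on the target this would wrap the logger. */
typedef void (*line_fn_t)(const char *line, void *ctx);

/* Format `len` bytes as "XX XX XX" into `out`; returns the string length. */
static size_t format_hex_line(const uint8_t *data, size_t len, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t pos = 0;
    assert(out_size >= len * 3u + 1u);
    for (size_t i = 0; i < len; i++) {
        out[pos++] = digits[data[i] >> 4];
        out[pos++] = digits[data[i] & 0x0Fu];
        if (i + 1u < len) {
            out[pos++] = ' ';
        }
    }
    out[pos] = '\0';
    return pos;
}

/* Emit the whole buffer one short line at a time, so that no single log
 * call has to carry the full array. Returns the number of lines emitted. */
static size_t hexdump_chunked(const uint8_t *data, size_t length, line_fn_t emit, void *ctx)
{
    char line[LINE_BUF_SIZE];
    size_t lines = 0;
    for (size_t off = 0; off < length; off += BYTES_PER_LINE) {
        size_t left = length - off;
        size_t n = left < BYTES_PER_LINE ? left : BYTES_PER_LINE;
        format_hex_line(data + off, n, line, sizeof line);
        emit(line, ctx);
        lines++;
    }
    return lines;
}

/* Example hook for host testing: prints the line and counts it. */
static void emit_stdout(const char *line, void *ctx)
{
    unsigned *count = (unsigned *)ctx;
    puts(line);
    if (count != NULL) {
        (*count)++;
    }
}
```

Each log call then carries at most 16 bytes of data, which stays well below the default hexdump limit mentioned earlier in the thread.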

  • Hi

    Annoyingly enough, the most cryptic bugs often have simple solutions, once you find them.

    I will try to give you a response in the other case later today, related to the power up issue, and consider this one resolved. 

    Best regards
    Torbjørn
