ZBOSS NCP error code 22 "NO_MEMORY"

Hi, 

I am using an nRF52840 as a Zigbee NCP. I built the firmware myself with the nRF Connect SDK, and I am also running MCUboot. After some usage, the chip sometimes starts returning error code 22, meaning "NO_MEMORY". From that point on, it is unusable until I reset the module. I have not been able to find a reliable way to reproduce the issue. What is the cause, and what could be a solution?

  • Hello,

    Can you please provide a bit more information? Which function returns -22?

    error code 22 meaning "NO_MEMORY"

    Usually -22 means EINVAL, but it may vary depending on which function returns it. ENOMEM is typically -12.

    But let me know which function starts returning 22, and we can have a look at it. Preferably also include some context or snippets showing how it is called and with what parameters.
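
    For reference, the two errno values mentioned above can be told apart with a small helper. This is a minimal sketch, assuming a Linux or Zephyr-style errno.h where EINVAL is 22 and ENOMEM is 12; the helper name errno_name is hypothetical and not part of any SDK. It only covers plain errno-style return values, not ZBOSS or NCP status codes.

```c
#include <errno.h>

/* Hypothetical helper: map an errno-style return value (negative or
 * positive) to the name of the matching errno constant. */
static const char *errno_name(int ret)
{
    int code = (ret < 0) ? -ret : ret;

    switch (code) {
    case EINVAL:
        return "EINVAL"; /* 22 on Linux and Zephyr */
    case ENOMEM:
        return "ENOMEM"; /* 12 on Linux and Zephyr */
    default:
        return "other";
    }
}
```

    Calling errno_name() on the value a function returned shows quickly whether a plain errno is involved at all.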

    Best regards,

    Edvin

  • Normally it works fine, but there is a point after which many different function calls start returning this error code. For example, when calling zb_zdo_mgmt_permit_joining_req like this:

    /* Response callback: log the status from the response, then release the buffer. */
    void mgmt_permit_joining_req_cb(zb_uint8_t param)
    {
        zb_zdo_mgmt_permit_joining_resp_t *resp;
        resp = (zb_zdo_mgmt_permit_joining_resp_t*)zb_buf_begin(param);
        TRACE_MSG(TRACE_APP1, "mgmt_permit_joining_req_cb status: %hd", (FMT__H, resp->status));
        zb_buf_free(param);
    }

    /* Send a permit joining request addressed to the local device. */
    void mgmt_permit_joining_req_p(zb_uint8_t param)
    {
        zb_zdo_mgmt_permit_joining_req_param_t *req_param = ZB_BUF_GET_PARAM(param, zb_zdo_mgmt_permit_joining_req_param_t);
        ZB_BZERO(req_param, sizeof(zb_zdo_mgmt_permit_joining_req_param_t));
        req_param->dest_addr = zb_get_short_address();
        req_param->permit_duration = ZB_BDBC_MIN_COMMISSIONING_TIME_S;
        req_param->tc_significance = ZB_TRUE;
        zb_zdo_mgmt_permit_joining_req(param, mgmt_permit_joining_req_cb);
    }

    The returned status is 22. At this point all other mgmt and APS calls also return 22, for example when trying to read or write an attribute.

    What I was able to find out is that the NCP itself is returning the error, and according to this document I found, it means NO_MEMORY: 
    ZBOSS_NCP_Serial_Protocol.pdf

    So my theory is that the memory of the NCP fills up after some time. Strangely, though, I have also run long-term tests with a script, repeatedly reading and writing attributes on a device, and the error did not occur. What I do not know is what causes it to get stuck like that, or, once it is stuck, how to free the memory, if that is indeed the problem.

  • noffz_zkalman said:
    What I was able to find out is that the NCP itself is returning the error, and according to this document I found, it means NO_MEMORY: 

    Ah, yes. Look at that. I see they are also defined in ncs\nrfxlib\zboss\production\include\zb_errors.h.

    I agree. It sounds like you are running out of buffers in some way. It is not necessarily the mgmt_permit_joining_req_p() calls that cause this, but they are the ones struggling to find free buffers. I have not seen your application, but I believe the first things to look at are callbacks, like the one you have in mgmt_permit_joining_req_cb(). Do you have any that do not free the buffer at the end using zb_buf_free(param)?

    Alternatively, you can keep searching for the function that exhausts your buffers. Perhaps there is a specific cluster you are reading, and this callback doesn't free the buffer?
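
    The leak scenario described above can be illustrated with a toy model. This is a minimal sketch, not ZBOSS code: the pool, its size, and every function name here are hypothetical, and only the status value 22 is taken from this thread. It shows how a single callback that never frees its buffer eventually makes every later request fail, including requests whose callbacks are correct.

```c
#include <stdbool.h>

#define POOL_SIZE        4   /* hypothetical pool size, not a ZBOSS value */
#define STATUS_OK        0
#define STATUS_NO_MEMORY 22  /* status value reported in this thread */

static bool in_use[POOL_SIZE];

/* Return a buffer id in 1..POOL_SIZE, or 0 when the pool is exhausted. */
static int pool_get(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return i + 1;
        }
    }
    return 0;
}

/* Return a buffer to the pool; invalid ids are ignored. */
static void pool_free(int id)
{
    if (id >= 1 && id <= POOL_SIZE) {
        in_use[id - 1] = false;
    }
}

typedef void (*response_cb)(int buf);

/* Correct callback: releases the buffer when done. */
static void good_cb(int buf)
{
    pool_free(buf);
}

/* Buggy callback: handles the response but never frees the buffer. */
static void leaky_cb(int buf)
{
    (void)buf;
}

/* Model one request: take a buffer and hand it to the response callback. */
static int send_request(response_cb cb)
{
    int buf = pool_get();

    if (buf == 0) {
        return STATUS_NO_MEMORY;
    }
    cb(buf);
    return STATUS_OK;
}

/* Issue n requests and count how many succeed. */
static int count_ok(response_cb cb, int n)
{
    int ok = 0;

    for (int i = 0; i < n; i++) {
        if (send_request(cb) == STATUS_OK) {
            ok++;
        }
    }
    return ok;
}
```

    In this model, requests using good_cb can run indefinitely, while requests using leaky_cb succeed only POOL_SIZE times. After that, even a request with good_cb gets STATUS_NO_MEMORY, which matches the symptom that unrelated calls all start failing at once.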

    If you want us to have a look, feel free to upload your application. If I can't find anything, I can ask whether our Zigbee team has time to look at it.

    Are you able to reproduce the issue using 2-3 DKs? If so, it is easier for us to reproduce the issue, and see what's going on.

    Best regards,

    Edvin

  • The application I sent you the snippet from is not running on the nRF52; it is a ZBOSS NCP Host application running on a Linux system. The memory error is not occurring in my code. I have encountered issues like that before, but that is not the case now.

    The code that is throwing the error is the Zigbee NCP sample application from the nRF Connect SDK. I basically just built the sample application and uploaded it to an nRF52840 DK.

    What I have found since then is that the issue mostly occurs when I am using the device as a coordinator and try to connect multiple devices to it at the same time. So my theory is that the NCP application runs out of available buffers, and that is what causes the issue.

    I have tried modifying the NCP sample to increase the number of available buffers. What I have found is that including this in the application increases the buffer count:

    #define ZB_CONFIGURABLE_MEM
    #include "zb_mem_config_max.h"

    But after some testing, the problem still persisted.

    The issue is that, as far as I have seen, the code of the application is not open source, so I cannot look deeper into where exactly this occurs.

    I will try to reproduce the issue using only a number of DKs, and I will get back to you if I succeed.

    Best Regards,

    Zalan
  • Hello Zalan,

    I understand. It would be really helpful if we were able to reproduce the issue one way or another.

    My understanding is that the NCP device will do exactly as it is told, so it may be that your Host application needs to tell the device to free up the buffers in some callback. Perhaps you can look over the callbacks used in the coordinator process when you "try to connect multiple devices to it at the same time".
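
    One way to look over the callbacks is to count, per callback, how many buffers were taken and how many were freed. This is a minimal sketch under assumptions: all names below are hypothetical helpers for the Host application, not ZBOSS or NCP APIs. The idea is to call note_buffer_taken() where a request buffer is obtained and note_buffer_freed() next to the matching zb_buf_free(param), using the same label for both.

```c
#include <stddef.h>
#include <string.h>

#define MAX_OWNERS 8 /* hypothetical table size for this sketch */

struct owner_stat {
    const char *name; /* label of the callback or call site */
    int outstanding;  /* buffers taken minus buffers freed */
};

static struct owner_stat owners[MAX_OWNERS];

/* Find the entry for a label, creating it on first use.
 * Returns NULL when the table is full. */
static struct owner_stat *owner_lookup(const char *name)
{
    for (size_t i = 0; i < MAX_OWNERS; i++) {
        if (owners[i].name == NULL) {
            owners[i].name = name;
            return &owners[i];
        }
        if (strcmp(owners[i].name, name) == 0) {
            return &owners[i];
        }
    }
    return NULL;
}

/* Record that a buffer was obtained on behalf of this label. */
static void note_buffer_taken(const char *name)
{
    struct owner_stat *s = owner_lookup(name);

    if (s != NULL) {
        s->outstanding++;
    }
}

/* Record that a buffer belonging to this label was freed. */
static void note_buffer_freed(const char *name)
{
    struct owner_stat *s = owner_lookup(name);

    if (s != NULL) {
        s->outstanding--;
    }
}

/* Number of buffers this label currently holds (0 if untracked). */
static int outstanding_for(const char *name)
{
    struct owner_stat *s = owner_lookup(name);

    return (s != NULL) ? s->outstanding : 0;
}
```

    A label whose outstanding count keeps growing while devices are joining points at the callback that is not returning its buffers.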

    Best regards,

    Edvin
