
General Error Handling & avoiding Fatal Error

Hello,

(I am using SDK 15.1 on a Linux system)

I will break the question down into two parts:

  1. When do we get a fatal error? I know one or two cases, for example when the SoftDevice doesn't have enough memory to execute the given operation, but are there other known scenarios that trigger it?
  2. The second question is about generic error handling. I would like to know what people generally do to avoid fatal conditions. I know a few ways, though probably not very good ones:
    • For example, when a call requires a system packet to be sent but the queue is full, you loop the call until it returns NRF_SUCCESS.
    • Or when the debug flag is not set, APP_ERROR_CHECK calls its handler, which eventually resets the system.

These questions are mainly from the point of view of putting devices into production.

Thanks

EDIT:

3. Referring to this error module link

and the file attached below,

  • I was not able to find where DEBUG is defined.
  • Also, according to the file, whenever it prints "fatal error" it should also print "System Reset" and do a system reset. Neither of these happens.

What am I missing, or where am I going wrong?

  • Hello,

    Check out this blog post:

    https://devzone.nordicsemi.com/b/blog/posts/an-introduction-to-error-handling-in-nrf5-projects

    So "fatal error" comes from an APP_ERROR_CHECK(err_code); call where err_code != NRF_SUCCESS.

    DEBUG should be defined in your preprocessor defines. If you are not sure where to set those, search for "preprocessor defines <your IDE>".

    If you have enabled NRF_LOG, the log should also show the location of the APP_ERROR_CHECK(err_code) call that received an err_code != NRF_SUCCESS, and what the return value was.
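To illustrate how a check of this kind can report where the failure came from, here is a minimal host-side sketch. It is a simplified model and not the SDK's code: MY_ERROR_CHECK, my_fault_handler and fault_info_t are hypothetical names, and NRF_SUCCESS is redefined locally as a stand-in for the SDK definition.

```c
#include <stdint.h>
#include <stdio.h>

#define NRF_SUCCESS 0u  /* local stand-in for the SDK definition */

/* Hypothetical record of the most recent failed check. */
typedef struct
{
    uint32_t     err_code;
    uint32_t     line;
    const char * file;
} fault_info_t;

static fault_info_t last_fault;
static unsigned     fault_count;

/* Hypothetical handler: remember and print the fault. A firmware handler
   would go on to halt or reset after this point. */
static void my_fault_handler(uint32_t err_code, uint32_t line, const char * file)
{
    last_fault.err_code = err_code;
    last_fault.line     = line;
    last_fault.file     = file;
    fault_count++;
    printf("fault: code %u at %s:%u\n", (unsigned)err_code, file, (unsigned)line);
}

/* Evaluate the expression once; call the handler only on a non-success
   code, passing the call site so the log can point at it. */
#define MY_ERROR_CHECK(err_code_expr)                              \
    do                                                             \
    {                                                              \
        const uint32_t local_err = (err_code_expr);                \
        if (local_err != NRF_SUCCESS)                              \
        {                                                          \
            my_fault_handler(local_err, __LINE__, __FILE__);       \
        }                                                          \
    } while (0)
```

Because the macro expands at the call site, __LINE__ and __FILE__ identify the failing call rather than the handler.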

    Regarding your second question, that depends on what the error was. To make it "simple/stupid": Don't do anything wrong. That turns out to be difficult, so the APP_ERROR_CHECK() is a tool that helps you check that you do everything in the right order, and that the variables that you pass into your function calls are valid. 

    Typical errors could be:

    - Using a module that is not initialized

    - Trying to send too much data into a module (too long advertising data, too many packets into a queue, etc.).

    - Using a module that you can't use while the SoftDevice is enabled.

    and many more. 

    So try to define DEBUG and see whether you can find out what the error is. When you find out which function returned an err_code != NRF_SUCCESS (NRF_SUCCESS is 0), look at the header file that declares this function, or look it up on Infocenter. That should give you some hints as to why it doesn't work.

    Best regards,

    Edvin

  • Thanks Edvin,

    One more thing: whenever I get "Fatal Error" (i.e. when DEBUG isn't defined) I never get "System reset" and the system doesn't actually reset. Why is that? According to the snippet below, it should, right?

    Regarding the other points, I also wanted to know how to handle errors when the devices (custom boards in this case) are sealed shut and in production. The only reliable way seems to be a system reset, rather than changing the state of the application. For example, if we get stuck in the connected state, the alternatives would be to disconnect and do the whole process again, or to loop the operation that returns the error until it succeeds or a fixed number of attempts is reached.

  • Hello,

    This is because of the line before #ifndef DEBUG:

    NRF_BREAKPOINT_COND;

    This generates a breakpoint if a debugger is connected. If you use the DK, it typically is, because you power the DK via the debugger.

    If you comment out NRF_BREAKPOINT_COND, you will see that it resets.

    Note that it shouldn't hit the breakpoint if you are not debugging, but I believe these registers (the ones that enable the breakpoint when you are debugging) are only set on power-on startup. Try to power-cycle the board while it is programmed with an application that does not define DEBUG in the preprocessor defines. Then it should reset at this point.
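The behaviour described above can be summarised as a small decision function. This is a host-side model only, not the SDK's fault handler: fault_action and fault_action_t are hypothetical names, and the "DEBUG defined, no debugger" branch (stay put so the fault can be inspected) is an assumption that should be checked against the error module source in your SDK.

```c
#include <stdbool.h>

typedef enum
{
    FAULT_HALT_AT_BREAKPOINT,  /* debugger attached: execution stops at the breakpoint */
    FAULT_SYSTEM_RESET,        /* no debugger and DEBUG not defined: reset */
    FAULT_STOP_FOR_INSPECTION  /* no debugger and DEBUG defined: assumed to stay put */
} fault_action_t;

/* Model of what happens after a failed error check. The conditional
   breakpoint comes first, so an attached debugger masks the reset. */
static fault_action_t fault_action(bool debugger_attached, bool debug_defined)
{
    if (debugger_attached)
    {
        return FAULT_HALT_AT_BREAKPOINT;
    }
    if (!debug_defined)
    {
        return FAULT_SYSTEM_RESET;
    }
    return FAULT_STOP_FOR_INSPECTION;
}
```

Under this model, "fatal error" is printed without "System reset" whenever the first branch is taken, which matches a DK that is powered through its debugger.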

    You have to decide whether or not to use APP_ERROR_CHECK() in your final product. You may, but note that many of the examples reset on conditions that don't need a reset. One case is the ble_app_uart example, which calls APP_ERROR_CHECK() if ble_nus_data_send() returns NRF_ERROR_RESOURCES, which means that the buffer is full. In many cases, you will want to keep calling ble_nus_data_send() until the buffer is full.
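A minimal sketch of "send until the buffer is full" without treating the full buffer as fatal could look like this. It runs on a host, so the send function is passed in as a pointer: send_fn_t, send_until_full, send_result_t and fake_send are hypothetical names, fake_send only stands in for a wrapper around ble_nus_data_send(), and the numeric error values are local stand-ins that should be taken from nrf_error.h in a real project.

```c
#include <stddef.h>
#include <stdint.h>

#define NRF_SUCCESS          0u   /* local stand-ins; use nrf_error.h in firmware */
#define NRF_ERROR_RESOURCES 19u

/* Hypothetical wrapper type around the real send call. */
typedef uint32_t (*send_fn_t)(const uint8_t * data, uint16_t len);

typedef struct
{
    size_t   sent;      /* packets accepted before stopping */
    uint32_t last_err;  /* NRF_SUCCESS, or the code that stopped the loop */
} send_result_t;

/* Queue equally sized packets until all are sent or the send call
   reports an error. Nothing is treated as fatal here. */
static send_result_t send_until_full(send_fn_t send, const uint8_t * data,
                                     uint16_t packet_len, size_t packet_count)
{
    send_result_t result = { 0, NRF_SUCCESS };

    for (size_t i = 0; i < packet_count; i++)
    {
        uint32_t err = send(data + i * packet_len, packet_len);
        if (err != NRF_SUCCESS)
        {
            result.last_err = err;
            break;
        }
        result.sent++;
    }
    return result;
}

/* Hypothetical stand-in for the radio buffer: accepts packets while
   free slots remain, then reports that the buffer is full. */
static unsigned fake_free_slots = 3;

static uint32_t fake_send(const uint8_t * data, uint16_t len)
{
    (void)data;
    (void)len;
    if (fake_free_slots == 0)
    {
        return NRF_ERROR_RESOURCES;
    }
    fake_free_slots--;
    return NRF_SUCCESS;
}
```

The caller can then treat last_err == NRF_ERROR_RESOURCES as "resume from packet index sent once buffers have been freed", and reserve the fatal path for codes it does not expect.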

    So you don't have to pass all return values into APP_ERROR_CHECK(). You have to choose based on your application. Of course, sometimes a reset is probably the way to go, e.g. if you are in a deadlock (although a watchdog timer may also be used for this).
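For a sealed production device, one possible policy is to retry a transient error a fixed number of times and only then escalate to a reset. The sketch below is a host-side illustration under stated assumptions: run_with_retries, operation_t, recovery_t and flaky_op are hypothetical names, NRF_ERROR_BUSY is used as the example of a transient code, and the numeric values are local stand-ins for nrf_error.h.

```c
#include <assert.h>
#include <stdint.h>

#define NRF_SUCCESS      0u   /* local stand-ins; use nrf_error.h in firmware */
#define NRF_ERROR_BUSY  17u

typedef uint32_t (*operation_t)(void * ctx);

typedef enum
{
    RECOVERY_NONE,   /* operation succeeded */
    RECOVERY_RESET   /* retries exhausted or non-transient error */
} recovery_t;

/* Call op up to max_attempts times. Only NRF_ERROR_BUSY is retried;
   any other failure escalates immediately, since repeating a call
   with invalid arguments or in the wrong state will not help. */
static recovery_t run_with_retries(operation_t op, void * ctx,
                                   unsigned max_attempts, uint32_t * last_err)
{
    assert(max_attempts > 0);

    uint32_t err = NRF_SUCCESS;
    for (unsigned attempt = 0; attempt < max_attempts; attempt++)
    {
        err = op(ctx);
        if (err != NRF_ERROR_BUSY)
        {
            break;  /* success, or an error that retrying cannot fix */
        }
        /* In firmware, wait or process pending events before retrying. */
    }

    *last_err = err;
    return (err == NRF_SUCCESS) ? RECOVERY_NONE : RECOVERY_RESET;
}

/* Hypothetical operation: busy for *ctx calls, then succeeds. */
static uint32_t flaky_op(void * ctx)
{
    unsigned * failures_left = ctx;
    if (*failures_left > 0)
    {
        (*failures_left)--;
        return NRF_ERROR_BUSY;
    }
    return NRF_SUCCESS;
}
```

On RECOVERY_RESET the firmware would log last_err if it can and then reset; a watchdog timer still covers the case where the code never gets as far as this decision.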

    Best regards,

    Edvin