Receiving an NRF_FAULT_ID_SD_ASSERT error some non-deterministic time after connecting via BLE

I consistently receive an NRF_FAULT_ID_SD_ASSERT error at a non-deterministic time after connecting via BLE.

The program counter is always 75936 and info == 0. I am using SDK 14.2 and SoftDevice 5.0 (which is not perfectly in line with the compatibility matrix, although I downloaded this configuration in January 2018 "as it is" from your site...).
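SoftDevice assert addresses are usually quoted in hexadecimal, and the program counter reported above, 75936, is 0x128A0. The following is a minimal sketch of a helper that formats the three fault values for logging. The helper name and the `EXAMPLE_` constant are hypothetical; the constant is assumed to mirror NRF_FAULT_ID_SD_ASSERT from the SoftDevice header nrf_sdm.h and should be checked against your own SDK.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: mirrors NRF_FAULT_ID_SD_ASSERT in nrf_sdm.h; verify in your SDK. */
#define EXAMPLE_FAULT_ID_SD_ASSERT 0x00000001u

/* Hypothetical helper: writes a one-line fault report into buf.
 * Returns the value of snprintf (characters that would be written). */
int example_format_fault(char *buf, size_t len,
                         uint32_t id, uint32_t pc, uint32_t info)
{
    const char *name =
        (id == EXAMPLE_FAULT_ID_SD_ASSERT) ? "SD_ASSERT" : "OTHER";

    /* Print pc and info as zero-padded hex so they can be compared
     * directly with addresses quoted in other reports. */
    return snprintf(buf, len, "fault=%s pc=0x%08lX info=0x%08lX",
                    name, (unsigned long)pc, (unsigned long)info);
}
```

In an SDK project, a helper like this would typically be called from the application's fault handler before the device resets, so that the same address can be recognised across runs.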

I have attached the sdk_config.h, because it has often been the cause of problems in the past.

I would be glad if you could give me some hints on where to look; hopefully I will be able to do the rest.

Thank you in advance, JulianMA.

sdk_config.h

  • Hi,

    I have seen two previous reports of asserts at that particular address. One of them was due to an incorrectly loaded crystal, and the other one is unfortunately still a mystery. As I wrote in the comments here, your assert indicates that something is interfering with the mechanisms that the SoftDevice uses to keep time. To keep time, the SoftDevice uses RTC0 and TIMER0, and both peripherals use interrupts that run at priority 0 (highest). So the question is how your application is able to disrupt these timing-critical mechanisms. It could be one of two root causes:

    1. An inaccurate clock causes the nRF52's time to tick too slowly or too fast relative to the rest of the world.
    2. Something in your application is interfering with the SoftDevice's RTC and TIMER interrupts, preventing them from being executed on time.
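To give a feel for the first cause, the sketch below computes how much timing error a clock accumulates over a given interval when it is off by a given number of parts per million. The function name and the example figures are illustrative assumptions, not values taken from this case.

```c
#include <stdint.h>

/* Timing error in microseconds accumulated over interval_us by a clock
 * whose frequency is off by ppm parts per million.
 * drift = interval * ppm / 1e6; 64-bit math avoids overflow. */
uint32_t example_drift_us(uint32_t interval_us, uint32_t ppm)
{
    return (uint32_t)(((uint64_t)interval_us * ppm) / 1000000u);
}
```

For example, `example_drift_us(1000000u, 20u)` gives 20, so a 20 ppm clock is off by 20 µs after one second, while a 500 ppm clock is off by 2 ms after four seconds. A crystal that runs outside the accuracy the stack was told to expect can therefore miss timing windows that are sized for the declared accuracy.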

    Some questions:

    1. Can you give us some more details about what kind of application you're making?
    2. I understand the assert seems to happen randomly, but is it possible to say anything at all about what the application is doing when the asserts occur? Is it advertising? In a connection? Processing something?
    3. Are you using a development kit or a custom PCB?
    4. Are you using a lot of interrupts in your application?
    5. Are you using the time slot API?
    6. Is this happening on more than one device? If so, what is the failure rate?
    7. Are you doing flash operations in your application?
    8. Are you at any point disabling interrupts to execute time critical code? (some drivers and libraries in the SDK do this under the hood)
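Questions 4 and 8 both come down to whether application code can delay the SoftDevice's priority 0 interrupts. As a minimal sketch, the check below encodes which interrupt priority levels are left to the application. It assumes the S132 layout on nRF52, where the SoftDevice reserves levels 0, 1 and 4; the function name is hypothetical, and the levels should be confirmed in the SoftDevice specification for your version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: S132 on nRF52 reserves priority levels 0, 1 and 4,
 * leaving 2, 3, 5, 6 and 7 for the application. */
bool example_priority_free_for_app(uint8_t prio)
{
    switch (prio)
    {
        case 2:
        case 3:
        case 5:
        case 6:
        case 7:
            return true;   /* available to application interrupts */
        default:
            return false;  /* reserved by the SoftDevice or out of range */
    }
}
```

An application interrupt configured at a reserved level, or a long section that runs with interrupts globally disabled, are the kinds of things that could keep RTC0 or TIMER0 handlers from running on time.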

    Please also read through the case I linked to above and see if you can spot some similarities.

  • Hi MartinBL,

    many thanks for your fast reply! I think that you provided lots of help already: I got the message that this problem is likely to be related to "timing" issues.

    @ About my application / hardware:

    My application currently runs on a custom PCB that follows Nordic's design guides as closely as possible. It is still at an early development stage. I have three slightly different layouts, which behave identically with respect to the "assert".

    @ Purpose of my hardware:

    The application is intended to read values from an ultraviolet radiation sensor via TWI and to send the sensor readings as BLE notifications (peripheral to central). I adapted your heart rate peripheral application (ble_app_hrs) and modified it to meet my requirements (added new characteristics, etc.).

    @ What is the application doing during failure:

    The application is in the "connected" state. That is all I can say; I assume that the ASSERT is triggered by a lower-level system interrupt.

    @ Problems with TWI:

    These can be excluded, since I commented out all TWI code and the problem persists.

    @ Are you using a lot of interrupts in your application?

    No.

    @ Are you using the time slot API?

    No.

    @ Is this happening on more than one device? If so, what is the failure rate?

    It happens on all three of my custom devices. All of them are slightly different but follow the Nordic design guides as closely as possible.

    @ Are you doing flash operations in your application?

    Only in the peer manager (e.g. for bonding). This should only happen during pairing and should therefore not be the reason for the assert.

    @ Are you at any point disabling interrupts to execute time critical code? (some drivers and libraries in the SDK do this under the hood)

    Hmmm. I do not disable interrupts myself; however, if this is done under the hood by SDK libraries, I cannot rule it out.

    @ As a result I see the following tasks as my "homework":

    1. I will test different hardware (e.g. the "Sparkfun Breakout Board").

    2. I will cut down the application in order to reduce the number of timers and anything that has to do with "time".

    3. I will modify the available clock sources in the sdk_config.h.

    4. If I manage to identify a solution I will notify you.
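For step 3, one possible experiment is to switch the low-frequency clock from the external crystal to the internal RC oscillator, which takes the crystal out of the picture. The fragment below is a sketch of what that could look like in sdk_config.h. The macro names are assumed to be the ones used by the SoftDevice handler in SDK 14.x, and the values are the ones commonly recommended for the RC source on nRF52; check both against your own sdk_config.h and SoftDevice headers.

```c
/* Assumption: macro names as used by the SoftDevice handler (nrf_sdh)
 * in SDK 14.x; verify them in your own sdk_config.h. */

/* Low-frequency clock source: 0 = internal RC, 1 = external crystal,
 * 2 = synthesized from the high-frequency clock. */
#ifndef NRF_SDH_CLOCK_LF_SRC
#define NRF_SDH_CLOCK_LF_SRC 0
#endif

/* RC calibration timer interval in units of 0.25 s (16 -> every 4 s). */
#ifndef NRF_SDH_CLOCK_LF_RC_CTIV
#define NRF_SDH_CLOCK_LF_RC_CTIV 16
#endif

/* Number of calibration intervals after which the RC oscillator is
 * calibrated even if the temperature has not changed. */
#ifndef NRF_SDH_CLOCK_LF_RC_TEMP_CTIV
#define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 2
#endif
```

If the assert disappears with the RC source, that points towards the crystal circuit; if it persists, the low-frequency crystal is a less likely culprit.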

  • Happy to help. 

    Testing on a development kit sounds like a good idea. Then we should be able to rule out hardware or clock accuracy issues caused by incorrectly loaded crystals. Have you tried running other BLE examples from the SDK on your custom devices?

    The other things you mention are also good ideas. Please keep me posted.

    One more thing. The customer in the mystery case that I mentioned earlier reported that the issue disappeared when they migrated from SDK 14.0.0 to SDK 14.2.0. They also used the Logger library in the SDK and while they ported to SDK 14.2.0, they also implemented the bug fix that my colleague Krzysztof Chruscinski has posted here. If you are using the logging library as well, can you also please try to implement the fix? 

    Have a good weekend.

    -Martin

  • Dear MartinBL,

    Regarding this issue, it turns out that my hardware causes the instabilities. I even investigated the hardware together with an electrical engineer (I am just a physicist), but we could not determine a major cause (e.g. the oscillator is loaded correctly). In any case, my design deviates slightly from Nordic's reference design.

    However, the main point for me is that I can continue to develop the software -- which can easily be done on a development board.

    Thank you so far for your effort.

  • OK, good to know. So, as I understand it, we can consider the case closed?

    -Martin
