sd_softdevice_enable does not return

On one of our products, we are experiencing approximately 4% failures during production. I traced the problem to the function sd_softdevice_enable not returning. The problem does not occur if a Segger JTAG probe is connected, or if the internal 32 kHz LF RC oscillator is used instead of the external crystal. With the external 32 kHz (20 ppm) crystal oscillator, however, about 4% of boards never return from sd_softdevice_enable. Our hardware engineers believe that the external LF oscillator is starting up normally on the problem boards; they do not observe any startup differences between a good board and a bad board. I would like to know the reasons why sd_softdevice_enable might not return.

  • Hi,

    The only 'wait state' in the SoftDevice enable is the wait for the LFCLK started event, so it is likely that the external crystal does not start as expected.

    Best regards,
    Kenneth

  • Hi Kenneth,

    We've looked at a number of units, some working and some non-working, and the 32 kHz crystal starts up identically in both cases (see attached screenshots). We've also verified that the parameters of the oscillator operation are accurate (frequency, amplitude, offset).

    Furthermore, swapping the crystal and capacitors does NOT cause a defective board to start functioning, and moving the crystal and capacitors from a defective board to a good board doesn't affect functionality.

    What would be the next things to look at that could be causing a malfunction of the softdevice_enable function?

    Best,

    Eiad Jandali

    Hardware Engineer

    XTAL start-up on a functioning board

    XTAL start-up on a non-functioning board

  • Hi Kenneth,

    This solves the problem on the problem boards that I have. We need to test this on a larger population of bad boards. I want to get a better understanding of what is happening. I assume that sd_softdevice_enable puts the processor into a sleep mode while waiting for the external LF oscillator to start up, and that when the processor wakes up, memory might be corrupted, so the function might not be able to return. By the same reasoning, if the external LF oscillator is started prior to the call to sd_softdevice_enable, the function no longer needs to wait for the oscillator to start because it is already running, so the processor is never put into a sleep mode and the function always returns. Is my understanding correct?

    Thanks,

    Wayne

  • wouchida said:
    By the same reasoning, if the external LF oscillator is started prior to the call to sd_softdevice_enable, the function no longer needs to wait for the oscillator to start because it is already running, so the processor is never put into a sleep mode and the function always returns.

    That is likely, but not certain. I know there was a change at some point: the SoftDevice used to always re-initialize the clock, and now it keeps the LF clock running if it was already running.

    Best regards,
    Kenneth

  • Kenneth,

    Thanks for the help in resolving this issue.

    Is this errata_108 issue chip specific (i.e. swapping the existing part with a new part should solve it)?

    If that's the case, does this issue occur on all date or build codes or only a subset?

    Thanks!

    Eiad Jandali

    All variants of the nRF52832 can have this issue (for instance, it may vary over temperature), so all applications running on the nRF52832 should apply this workaround.

    Best regards,
    Kenneth 

  • Kenneth,

    OK, we have a different version of the product which is essentially the same circuit with a different PCB layout and mechanical enclosure, and we have NOT observed the issue with that product. Any ideas why?

    With the product experiencing failures, we are seeing around 4% of units impacted. Does that make sense based on Nordic's data on this issue? Some context here will help us understand whether errata_108 is responsible for all of our issues or whether there is a second effect we are missing.

    Can this issue spread to the working boards (is errata_108 susceptibility something that could increase with usage time)? This will help dictate how we deal with existing inventory that already has firmware installed.

    Thanks,

    Eiad Jandali

  • Hi Eiad,

    Kenneth is on vacation, so I will try to help.

    As far as I know, your 4% error rate is in the ballpark of what we have observed. We recommend implementing the errata fix in all of your products.

    As for the "working product" you have now: the fact that it is working properly doesn't mean that the chip doesn't have the issue. It is possible that the corrupted memory is at a location that doesn't affect the normal operation of the device at this moment. It cannot be guaranteed that it will always work fine in the future.

  • Questions:
    1. I assume that for errata 108, the following code segments from a later SDK can be used. Is this correct?

    2. Currently, we are using the nRF52832 chip, revision Rev 1 (QFAA-B00), and will eventually be transitioning to Rev 2 (QFAA-E00). Will the fix below and the errata fixes for SDK_11 still work for the Rev 2 hardware? In other words, will the hardware version check still be satisfied for Rev 2 hardware?

    /* Workaround for Errata 108 "RAM: RAM content cannot be trusted upon waking up from System ON Idle or System OFF mode" found at the Errata document for your device located at infocenter.nordicsemi.com/ */
    if (errata_108()){
            *(volatile uint32_t *)0x40000EE4 = *(volatile uint32_t *)0x10000258 & 0x0000004F;
    }

    static bool errata_108(void)
    {
        if ((((*(uint32_t *)0xF0000FE0) & 0x000000FF) == 0x6) && (((*(uint32_t *)0xF0000FE4) & 0x0000000F) == 0x0)){
            if (((*(uint32_t *)0xF0000FE8) & 0x000000F0) == 0x30){
                return true;
            }
            if (((*(uint32_t *)0xF0000FE8) & 0x000000F0) == 0x40){
                return true;
            }
            if (((*(uint32_t *)0xF0000FE8) & 0x000000F0) == 0x50){
                return true;
            }
        }

        return false;
    }

    Thanks,

    Wayne

  • Hi Wayne, 

    Yes, it's the correct workaround. 

    The erratum remains on Rev 2, and the above code can be used for Rev 2 as well.
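
The revision check quoted above can also be exercised off-target. The following is a minimal host-side sketch, assuming the three ID words read from 0xF0000FE0, 0xF0000FE4 and 0xF0000FE8 are passed in as plain integers. The function name errata_108_applies and its parameters are hypothetical and not part of the SDK; the comparisons simply mirror the quoted errata_108(). The thread does not say which value a Rev 2 part reports at 0xF0000FE8, so it is worth reading that word from an actual Rev 2 part and running it through this logic (or comparing against the current Nordic MDK) before relying on the check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical host-testable mirror of the quoted errata_108() check.
 * id0, id1, id2 stand in for the words at 0xF0000FE0, 0xF0000FE4 and
 * 0xF0000FE8; on target they would be read from those addresses.
 */
bool errata_108_applies(uint32_t id0, uint32_t id1, uint32_t id2)
{
    /* Same first test as the quoted code: low byte 0x6, low nibble 0x0. */
    if ((id0 & 0x000000FFu) != 0x6u || (id1 & 0x0000000Fu) != 0x0u) {
        return false;
    }

    /* The quoted code accepts only these three values of bits 7..4. */
    uint32_t variant = id2 & 0x000000F0u;
    return variant == 0x30u || variant == 0x40u || variant == 0x50u;
}
```

With this logic, any part whose 0xF0000FE8 word has a value other than 0x3, 0x4 or 0x5 in bits 7..4 is not matched, so the workaround write to 0x40000EE4 would be skipped for it.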

     

     
