Application not running after BLE DFU

Hello,

My current problem is that my application does not run after I flash it via BLE DFU (nRF52832 on a BLE Nano V2, using nRF Connect). The same code runs fine when I flash the application over SWD (the DAPLink USB companion board of the BLE Nano). I commented out the code I had added most recently, transferred the application over BLE again, and it worked. I then added the code back step by step and tested the DFU update after every step. At some point the application no longer starts after being flashed via BLE DFU. I don't think this is specific to my application's source code, because the application runs without problems when flashed over cable. Could it be a memory problem during the update? With code commented out, the application is of course smaller.

I'm using SDK v15, S132 v6, Eclipse and GCC, and I'm only updating the application, not the SoftDevice or the bootloader.

While searching for the error, I tried to enable logging with UART as the backend in the bootloader, which is basically the example from the SDK. But after modifying the Makefile and sdk_config.h I get the following linker error:

_build/nrf52832_xxaa_s132/nrfx_uarte.c.o (symbol from plugin): In function `m_nrf_log_UARTE_logs_data_const':
(.text+0x0): multiple definition of `UARTE0_UART0_IRQHandler'
_build/nrf52832_xxaa_s132/nrfx_uart.c.o (symbol from plugin):(.text+0x0): first defined here

I only added the header and source files needed for logging over UART to the bootloader project; I made no changes to the code.

If someone could help me solve the linker problem, maybe I'll get one step further with the DFU problem ;)

EDIT: The DFU update works now, but I don't know exactly why. In the meantime I worked in the same program on a button for entering the System OFF state. While testing this, I found out that the BLE update now works and the transferred application runs. But I also found out that another part of my code, which I didn't touch, no longer works, without any error messages. I think there is one big general problem, but I can't see the connection. It's weird.

Thanks in advance

Markus

  • Hi Markus,

    In essence, you see different unrelated issues that come and go when you make seemingly unrelated changes to your application? What is the state of your application when you say it is not running? Can you check with a debugger what is going on? Is it, for example, in an error handler, or is it waiting for something that never happens? It is not easy to point at a cause in this case, but there are a few possible reasons worth investigating in more detail:

    • Interrupt priority issue: could it be that you have some code running in an interrupt context that is waiting for something that is going to be done in a lower interrupt priority or main context? If so, this would result in a lockup.
    • Memory corruption: Perhaps you for some reason overwrite memory that is used for something else (accessing an array out of bounds or similar). If so, anything could happen, depending on what memory was overwritten.
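
    To make the first point concrete, here is a minimal host-side sketch of the Cortex-M preemption rule. It is not SDK code: can_preempt, busy_wait_can_finish and THREAD_MODE_PRIO are hypothetical names, and the priority values are only illustrative. It assumes the usual Cortex-M convention that a numerically lower value is more urgent, and it ignores sub-priority grouping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: main() (thread mode) is less urgent than any IRQ. */
#define THREAD_MODE_PRIO 256u

/* A pending interrupt preempts the running code only if its priority
 * value is strictly lower (more urgent) than the running one. */
static bool can_preempt(uint32_t running_prio, uint32_t pending_prio)
{
    return pending_prio < running_prio;
}

/* A busy-wait running at waiter_prio can only end if the interrupt that
 * sets the awaited flag (signaller_prio) is able to preempt the waiter. */
static bool busy_wait_can_finish(uint32_t waiter_prio, uint32_t signaller_prio)
{
    return can_preempt(waiter_prio, signaller_prio);
}
```

    For example, busy_wait_can_finish(7, 6) is true (a priority 6 interrupt can interrupt a priority 7 handler), while busy_wait_can_finish(6, 7) and busy_wait_can_finish(6, 6) are false, which is the lockup case described above.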
  • "In essence you see different unrelated issues that come and go when you do seemingly unrelated changes to your application?"

    Yes, you could say so.

    "What is the state of your application when you say it is not running? Can you check with a debugger what is going on?"

    Sorry, I can't check it anymore, because the issue no longer occurs. But the device was not advertising, and the devices connected via TWI and UART were not working.

    I'll check my code against your suggestions. Can you give me more examples of what else could cause memory corruption?

  • I debugged the application and found that I get a segmentation fault when calling a particular function. When the function is called from main(), no problem occurs; when it is called from the UART or app timer event handler, the segmentation fault happens. The function uses TWI. The priority of the TWI interrupt is higher than the UART and app timer IRQ priorities. Is it generally possible to call functions like that from an event handler?

  • It seems you are getting closer to the issue. The most typical interrupt priority issue is a lockup due to a higher priority thread waiting for something that should be done by a lower priority thread. How have you verified the fault?

    If you see a fault, then I am inclined to think that memory corruption could still be the problem, but it is difficult to say anything with confidence without a deeper understanding of your application. Is there any data shared between these functions that could be in a bad state if one is interrupted at the wrong time? Or perhaps the issue could be a stack overflow. Do you have data located right beneath the stack (address-wise)? Can you increase the stack size and see if it helps?
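
    One common way to check for a stack overflow is to fill the stack region with a known pattern at startup and later count how much of the pattern survived. The following is only a sketch on a plain array so that it can run on a PC; stack_paint, stack_unused_words and STACK_FILL_PATTERN are hypothetical names, not SDK functions. It assumes a stack that grows towards lower addresses, as on Cortex-M, so the lowest words are the last to be used.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu  /* arbitrary marker value (assumption) */

/* Fill the region with the marker before the stack is used. */
static void stack_paint(uint32_t *region, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        region[i] = STACK_FILL_PATTERN;
    }
}

/* Count the words at the low end of the region that still hold the marker.
 * The result is the headroom the stack has never touched so far. */
static size_t stack_unused_words(const uint32_t *region, size_t words)
{
    size_t unused = 0;
    while (unused < words && region[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}
```

    On the target, the region would be the stack area defined by the startup file or linker script, and only the part below the current stack pointer may be painted. If the number of untouched words reaches zero, the stack has been used all the way down to its limit, which suggests an overflow.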

  • Hi,

    There was an issue with multiple duplicates of your latest post, and for some reason all got deleted during clean-up. I will try to get it back, but I cannot promise anything. Did you by any chance copy the content so that you can post it again if I am not able to retrieve the deleted post? I am sorry for the inconvenience.

  • I didn't copy it, so I have to rewrite the post:

    I checked the priorities: they are 7 for the app timer, 7 for the UART and 6 for TWI, because the application was originally supposed to send TWI messages from the UART and app timer handlers (for test purposes, the handlers now only set a flag, and the flag is polled in the endless loop in main). The SoftDevice priorities are untouched. I hope I haven't forgotten any priority.

    I doubled the stack size in the Makefile:

    nrf52832_xxaa: CFLAGS += -D__HEAP_SIZE=8192
    nrf52832_xxaa: CFLAGS += -D__STACK_SIZE=16384
    nrf52832_xxaa: ASMFLAGS += -D__HEAP_SIZE=8192
    nrf52832_xxaa: ASMFLAGS += -D__STACK_SIZE=16384

    Is this everything I have to do to get more space for the stack or is there anything else to do? Are these values realistic? At the end of building the application I get

     text	   data	    bss	    dec	    hex	filename
    56792	    660	   3628	  61080	   ee98	_build/nrf52832_xxaa.out

    I'm using a softdevice and a bootloader.

    Where can I find out which data is located beneath the stack, or in RAM in general?

    The part of the code that seems to cause trouble is the following.
    It is located in main.c and calls drv2605_setWaveform from main() and from enter_sleep():

    volatile bool button_released = false;
    volatile bool get_akku = false;
    
    void enter_sleep(void)
    {
        uint32_t err_code;

        err_code = app_timer_stop_all();
        APP_ERROR_CHECK(err_code);

        uint8_t waves2[] = { 1 };
        drv2605_setWaveform(waves2, sizeof(waves2)/sizeof(uint8_t));

        err_code = nrf_sdh_disable_request();
        APP_ERROR_CHECK(err_code);
    }


    int main(void)
    {
        uint8_t confirmWave[] = { 1 };
        drv2605_setWaveform(confirmWave, sizeof(confirmWave)/sizeof(uint8_t));

        for (;;)
        {
            if (button_released == true)
            {
                button_released = false;
                enter_sleep();
            }
            if (get_akku == true)
            {
                get_akku = false;
                update_akku();
            }
            if (NRF_LOG_PROCESS() == false)
            {
                nrf_pwr_mgmt_run();
            }
        }
    }

    drv2605_setWaveform is located in drv2605.c, and this is where the segmentation fault sometimes comes up. It mostly seems to work fine when called from main(), and it almost never works when called from enter_sleep().

    void twi_writeDRV2605Register(uint8_t reg, uint8_t data)
    {
    	uint8_t data_comb[2] = { reg, data };
    	twi_transmit(DRV2605_ADDR, data_comb, 2, false);
    }
    
    
    void drv2605_setWaveform(uint8_t waveIDs[], uint8_t length)
    {
    	uint8_t val_to_write = 0x00;
    
    	if (length > 8) {
    		return;
    	}
    	else {
    		for (uint8_t slot = 0; slot < 8; slot++) {
    			if (slot < length) {
    				val_to_write = waveIDs[slot];
    			}
    			else {
    				val_to_write = 0x00;
    			}
    			twi_writeDRV2605Register(DRV2605_REG_WAVESEQ1 + slot, val_to_write);
    		}
    	}
    }

    twi_transmit is located in twi_set.c because it is also used by another part of the application, which handles another device on the TWI bus, so m_xfer_done, m_twi and twi_handler are shared as well:

    static const nrf_drv_twi_t m_twi = NRF_DRV_TWI_INSTANCE(TWI_INSTANCE_ID);
    static volatile bool m_xfer_done = false;
    
    void twi_handler(nrf_drv_twi_evt_t const * p_event, void* p_context)
    {
        switch (p_event->type)
        {
            case NRF_DRV_TWI_EVT_DONE:
                m_xfer_done = true;
                break;
                
            default:
                break;
        }
    }
    
    
    void twi_wait_for_transfer(void)
    {
    	while (m_xfer_done == false);
    	m_xfer_done = false;
    }
    
    
    void twi_transmit(uint8_t dev_address, uint8_t* p_data, uint8_t length, bool no_stop_bit)
    {
    	ret_code_t err_code;
    	err_code = nrf_drv_twi_tx(&m_twi, dev_address, p_data, length, no_stop_bit);
    	APP_ERROR_CHECK(err_code);
    
    	twi_wait_for_transfer();
    }

    The code has changed slightly compared to the old post, but the problem is still the same ;)

  • Thank you for writing the post again.

    Can you elaborate exactly what you mean by segmentation fault? Can you use a debugger to see the state of the system when and right before this happened? Is any error check hit (APP_ERROR_CHECK with non-zero value)? If so, the default error handler (app_error_fault_handler) should tell you more about what happened if you build the application with DEBUG defined.
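
    To find out whether the application is stuck in a wait loop rather than in a fault, it can help to bound the wait. In the handler posted above only NRF_DRV_TWI_EVT_DONE sets m_xfer_done, so if a transfer ends with any other event (for example a NACK), twi_wait_for_transfer would spin forever. Below is a minimal sketch of a bounded wait. It is written against plain flags so that it runs on a PC; m_xfer_error, wait_result_t and twi_wait_bounded are hypothetical names that do not exist in the SDK, and the poll limit is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the flags a TWI event handler would set (assumption:
 * m_xfer_error is a new flag, set when the transfer ends with an error). */
static volatile bool m_xfer_done  = false;
static volatile bool m_xfer_error = false;

typedef enum { WAIT_OK, WAIT_ERROR, WAIT_TIMEOUT } wait_result_t;

/* Poll the flags at most max_polls times instead of spinning forever.
 * The flag that ended the wait is cleared before returning. */
static wait_result_t twi_wait_bounded(uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (m_xfer_error) {
            m_xfer_error = false;
            return WAIT_ERROR;
        }
        if (m_xfer_done) {
            m_xfer_done = false;
            return WAIT_OK;
        }
    }
    return WAIT_TIMEOUT;
}
```

    In the real handler, m_xfer_error would be set in the cases for the NACK events (NRF_DRV_TWI_EVT_ADDRESS_NACK and NRF_DRV_TWI_EVT_DATA_NACK in nrf_drv_twi.h), and the caller could log or assert on anything other than WAIT_OK.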

    Regarding the stack, its placement is toolchain dependent, and the placement of other data is application dependent. You can, for example, check the .map file, or the graphical representation in some IDEs (like the Memory Usage view in Segger Embedded Studio). It does not seem likely that a stack overflow is the issue here, though.
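
    As a rough cross-check of the size output posted above: the text and data columns end up in flash, while data and bss occupy RAM as statically allocated variables. The sketch below only does that arithmetic; size_report_t, flash_bytes and static_ram_bytes are hypothetical helper names. Whether the stack and heap are already included in bss depends on the linker script, so treat that as something to verify in the .map file rather than a given.

```c
#include <stdint.h>

/* The three section totals printed by the size tool, in bytes. */
typedef struct {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
} size_report_t;

/* Code plus the initial values of initialised variables live in flash. */
static uint32_t flash_bytes(size_report_t s)
{
    return s.text + s.data;
}

/* Initialised (data) and zero-initialised (bss) variables live in RAM. */
static uint32_t static_ram_bytes(size_report_t s)
{
    return s.data + s.bss;
}
```

    With the posted numbers (text 56792, data 660, bss 3628) this gives 57452 bytes of flash and 4288 bytes of static RAM. If the stack (16384) and heap (8192) are not part of bss, another 24576 bytes must fit into the RAM that the SoftDevice leaves to the application.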
