Facing HardFault issue while using SPI peripheral in any callback function in nRF Connect SDK 2.4.0

Hi, I want to implement a callback/interrupt-driven flow in which I read and write data over SPI. I am using an nRF52840 as the SPI master and an ESP32 dev kit as the slave. When I communicate from a while loop in main.c, everything works fine. But when I issue the SPI send command from inside the SPI callback handler, I get a hard fault on the next transaction, at the point where it goes to set the CS pin of the SPI line.

This is the code I am using to send a command to the slave:

void readDataFromSlave(uint8_t slave_id, uint8_t *txBuffer, uint8_t txLength, uint8_t *rxBuffer, uint8_t rxLength)
{
	int err;

	struct spi_buf tx_spi_bufs[] = {
		{.buf = txBuffer, .len = txLength }
	};

	struct spi_buf_set spi_tx_buffer_set = {
		.buffers = tx_spi_bufs,
		.count = 1
	};

	struct spi_buf rx_spi_bufs[] = {
		{.buf = rxBuffer, .len = rxLength}
	};

	struct spi_buf_set spi_rx_buffer_set = {
		.buffers = rx_spi_bufs,
		.count = 1
	};

	if(slave_id == Cam1Slave)
	{
		gpio_pin_set(gpio0_dev,GPIO_0_CS,0);
		#ifdef USE_CB
			err = spi_transceive_cb(spi1_dev,&spi_cfg,&spi_tx_buffer_set,&spi_rx_buffer_set,SPI1CallbackHandler,rxBuffer);
		#else
			err = spi_transceive(spi1_dev,&spi_cfg,&spi_tx_buffer_set,&spi_rx_buffer_set);
		#endif
		
		gpio_pin_set(gpio0_dev,GPIO_0_CS,1);
	}
	
}

This is the SPI callback function. I get the hard fault when the code reaches it and goes to send the command again:

void SPI1CallbackHandler(const struct device *dev, int result, void *data){
	
	if(result == 0){
		 printk("SPI device %s: Device data - %d ------Transceive operation successful\n", dev->name, *(uint8_t*)data);
		 checkEspReady(0);
		transCount++;
	}
	else{
	  	printk("SPI device %s: Device data - %d  -----Transceive operation failed\n", dev->name, *(uint8_t*)data);
	}
}
This is the overlay file for the project

// To get started, press Ctrl+Space to bring up the completion menu and view the available nodes.

// You can also use the buttons in the sidebar to perform actions on nodes.
// Actions currently available include:

// * Enabling / disabling the node
// * Adding the bus to a bus
// * Removing the node
// * Connecting ADC channels

// For more help, browse the DeviceTree documentation at https://docs.zephyrproject.org/latest/guides/dts/index.html
// You can also visit the nRF DeviceTree extension documentation at https://nrfconnect.github.io/vscode-nrf-connect/devicetree/nrfdevicetree.html

/ {
	chosen {
		nordic,nus-uart = &uart0;
	};
};

&pinctrl {

    spi1_default: spi1_default {
    group1 {
        psels = <NRF_PSEL(SPIM_SCK, 1, 8)>,
            <NRF_PSEL(SPIM_MOSI, 1, 9)>,
            <NRF_PSEL(SPIM_MISO, 0, 11)>;
		};
	};

	spi2_default: spi2_default {
		group1 {
			psels = <NRF_PSEL(SPIM_MISO, 0, 13)>,
           <NRF_PSEL(SPIM_MOSI, 0, 17)>,
           <NRF_PSEL(SPIM_SCK, 0, 16)>;
		};
	};
};

&spi1 {
	compatible = "nordic,nrf-spi";
	status = "okay";
	pinctrl-0 = <&spi1_default>;
	pinctrl-1 = <&spi1_sleep>;
	pinctrl-names = "default", "sleep";
};

&spi2 {
	compatible = "nordic,nrf-spi";
	status = "okay";
	pinctrl-0 = <&spi2_default>;
	pinctrl-1 = <&spi2_sleep>;
	pinctrl-names = "default", "sleep";
};

&uart0_default {
	group1 {
		psels = <NRF_PSEL(UART_TX, 0, 24)>, <NRF_PSEL(UART_RTS, 0, 5)>, <NRF_PSEL(UART_RX, 0, 25)>;
	};

	group2 {
		psels = <NRF_PSEL(UART_CTS, 0, 7)>;
	};
};

&uart0 {
	compatible = "nordic,nrf-uarte";
	status = "okay";
	current-speed = <115200>;
	pinctrl-0 = <&uart0_default>;
};

&spi3 {
    status = "disabled";
};

// &qspi_default {
//     group1 {
//         psels = <NRF_PSEL(QSPI_IO2, 0, 22)>;
//     };

//     group2 {
//         psels = <NRF_PSEL(QSPI_IO3, 1, 0)>,
//                 <NRF_PSEL(QSPI_IO0, 0, 23)>,
//                 <NRF_PSEL(QSPI_IO1, 0, 21)>,
//                 <NRF_PSEL(QSPI_CSN, 0, 3)>,
//                 <NRF_PSEL(QSPI_SCK, 0, 19)>;
//     };
// };

&qspi_sleep {
    group1 {
        psels = <NRF_PSEL(QSPI_IO2, 0, 22)>;
    };
};
/delete-node/ &{/pin-controller/spi2_sleep/group1/};

&spi1_sleep {
    group1 {
        psels = <NRF_PSEL(SPIM_MISO, 1, 8)>;
    };
};
/delete-node/ &{/pin-controller/pwm0_default/group1/};
/delete-node/ &{/pin-controller/pwm0_sleep/group1/};

// &led0 {
//     /delete-property/ gpios;
// };
// /delete-node/ &{/pin-controller/qspi_sleep/group2/};

// &led3 {
//     /delete-property/ gpios;
// };
&qspi_default {
    group1 {
        psels = <NRF_PSEL(QSPI_IO2, 0, 22)>,
                <NRF_PSEL(QSPI_IO1, 0, 21)>,
                <NRF_PSEL(QSPI_IO3, 1, 0)>,
                <NRF_PSEL(QSPI_SCK, 0, 19)>,
                <NRF_PSEL(QSPI_IO0, 0, 23)>,
                <NRF_PSEL(QSPI_CSN, 0, 3)>;
    };
};
&qspi {
	pinctrl-0 = <&qspi_default>;
	mx25r64: mx25r6435f@0 {
		compatible = "nordic,qspi-nor";
		reg = <0>;
		//changes for GD25Q32E
		//quad-enable-requirements = "S2B1v1";
		writeoc = "pp4o";
		readoc = "read4io";
		sck-frequency = <24000000>; //Note: Can't achieve 32MHz.
		jedec-id = [c8 40 16];
		size = <33554432>;
	};
};

SDK - nRF Connect SDK 2.4.0

Please let me know if any more details are required.

  • Hi, thanks for the help, but I have resolved the issue. I just had to disable asserts by setting

    CONFIG_ASSERT=n
    in the prj.conf file of the project.
    We can close this now.
  • Hello,

    I am glad to read that this is not an issue for you anymore, but I am still curious which ASSERT it was that triggered?
    In general, asserts are added as guardrails for you to hit during development, so that the condition will not be an issue in release. I would therefore suggest that we take a closer look at the specific assert that triggered in this case, so that we can ensure that it is not something that will come back to cause trouble for you later in the development.
    In general it is not recommended to perform calls that could be blocking (or otherwise lengthy) as part of an event handler.

    Best regards,
    Karl

  • Thank you for clarifying. The assertion here happens because the SPI driver attempts to wait indefinitely for the semaphore in ISR context, which is not allowed.

    "The kernel does allow an ISR to take a semaphore, however the ISR must not attempt to wait if the semaphore is unavailable." This means the SPI driver cannot work when called from an ISR.

    Since this is an issue with Zephyr's SPI context, I would raise the question in the Zephyr Discord to ask what the best practice for working around it would be.
    The quickest and simplest solution would probably be to change the K_FOREVER to K_NO_WAIT, but I would not generally recommend making changes to the drivers and infrastructure directly without having consulted the Zephyr community first.

    Best regards,
    Karl
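
The difference between waiting forever and not waiting at all can be illustrated with a small host-runnable model. This is only a sketch: `model_sem`, `model_sem_try_take`, `model_sem_give` and `start_transfer_from_isr` are hypothetical names invented for illustration and are not Zephyr APIs. In Zephyr the corresponding call would be `k_sem_take()` with `K_NO_WAIT` instead of `K_FOREVER`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a counting semaphore (not Zephyr's struct k_sem). */
struct model_sem {
	unsigned int count; /* tokens currently available */
	unsigned int limit; /* maximum number of tokens */
};

/* Non-blocking take: succeeds only if a token is available right now.
 * This models the behaviour requested by K_NO_WAIT.
 */
static bool model_sem_try_take(struct model_sem *sem)
{
	if (sem->count == 0u) {
		return false; /* unavailable: report it instead of waiting */
	}
	sem->count--;
	return true;
}

/* Return a token, for example when a transfer has completed. */
static void model_sem_give(struct model_sem *sem)
{
	assert(sem->count < sem->limit);
	sem->count++;
}

/* Code that may run in interrupt context must never wait for the lock.
 * If the bus is busy, it returns false so the caller can retry later
 * from thread context.
 */
static bool start_transfer_from_isr(struct model_sem *bus_lock)
{
	if (!model_sem_try_take(bus_lock)) {
		return false;
	}
	/* ... the transfer would be started here ... */
	return true;
}
```

With a lock that holds one token, the first call succeeds and a second call made before `model_sem_give()` fails immediately rather than blocking. A blocking take in the same situation would wait forever, which is the condition the assert guards against.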

  • Thanks Karl , It would be great if you can give some input or get . Actually disabling the ASSERT is not a best practice to follow. 

  • Sarvesh said:
    Thanks Karl

    No problem at all, I am happy to help! :) 

    Sarvesh said:
    Actually disabling the ASSERT is not a best practice to follow. 

    I agree with this, especially so since you might be missing other important asserts during your development as well.

    Sarvesh said:
    It would be great if you can give some input or get

    I would recommend that you go into the Zephyr Discord directly to ask for their input on this. That way you can get the help you need faster.

    Best regards,
    Karl

  • I too have just been stung by this one.

    I know that in other RTOSes one is not allowed to make blocking calls in ISRs; this is normal. I am just not sure about the Zephyr case.

    If the meat of the interrupt processing is done from a worker thread, then I assume you can make blocking calls? This is how the original driver for the SPI peripheral I am working with operated: the GPIO interrupt fired, and the ISR disabled further interrupts and posted a job to a workqueue, where SPI transceive (which makes a blocking call) was called.

    But the driver I have has been hacked, and now SPI transceive is called directly from the ISR. In fact, the GPIO interrupt fires and our SPI handler is called directly from interrupt context.

    As part of a merge of my code into this branch, the branch picked up CONFIG_ASSERT=y from my branch; hence the issue being revealed to me.

    Is what I have said correct and what is best practise here?

    I can see 2 design possibilities here....

    1) The ISR processes the peripheral to "clear down" the interrupt and posts the result to a worker thread for further processing.

    2) As in the original driver, the ISR just disables the peripheral interrupt and posts a job to the worker thread. All processing is done at task/thread level.

    If the ISR is to do any real work, it needs to do SPI transactions to access peripheral registers, and we hit this assert.

    Could you advise? Did anything happen on Discord?


  • Hello,

    OwainJangor said:
    Is what I have said correct and what is best practise here?

    You are correct that you should not make blocking calls as part of your ISR.

    The general advice here is that you should avoid computationally intensive work or blocking calls in your ISRs. Instead, your ISRs should signal for the heavy work to be done in a lower-priority work thread or similar, to avoid blocking all lower-priority interrupts for the duration of the execution.
    This goes especially for blocking calls whose completion time you cannot know for certain, such as reception of data from another device.

    Something I should also note is that I have seen customers coming from projects with non-Nordic devices who are unaware of the EasyDMA feature of the nRF52 and nRF53 Series devices, which lets you receive data through the serial interfaces without the CPU actively receiving it. This of course also means that you do not need to use blocking calls to receive your serial data. The same goes for transmission of serial data.

    OwainJangor said:
    Did anything happen on Discord?

    The Zephyr Discord is open to join, and so if Sarvesh does not come back with a reply here you could join the server yourself and look for the thread on this matter :) 

    Best regards,
    Karl
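
The deferral pattern described above can be sketched as a small host-runnable model: the interrupt side only records an event, and the thread side performs the blocking work. All names here (`event_queue`, `isr_post_event`, `worker_drain`, `blocking_transfer`) are hypothetical and chosen for illustration. In a Zephyr application the ISR would typically call `k_work_submit()` and the work handler would call `spi_transceive()`; this sketch uses plain C stand-ins so that it can run anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_QUEUE_LEN 8u /* one slot stays empty, so 7 events fit */

/* Single-producer, single-consumer ring buffer shared by ISR and worker. */
struct event_queue {
	uint8_t events[EVENT_QUEUE_LEN];
	volatile size_t head; /* advanced only by the ISR side */
	volatile size_t tail; /* advanced only by the worker side */
};

/* ISR side: record the event and return at once. Never blocks.
 * Returns false if the queue is full and the event was dropped.
 */
static bool isr_post_event(struct event_queue *q, uint8_t event)
{
	size_t next = (q->head + 1u) % EVENT_QUEUE_LEN;

	assert(q != NULL);
	if (next == q->tail) {
		return false;
	}
	q->events[q->head] = event;
	q->head = next;
	return true;
}

/* Hypothetical stand-in for a blocking SPI transfer; it only counts calls. */
static unsigned int transfers_done;

static void blocking_transfer(uint8_t event)
{
	(void)event;
	transfers_done++;
}

/* Worker side: runs in thread context, so blocking calls are allowed here.
 * Returns the number of events that were handled.
 */
static unsigned int worker_drain(struct event_queue *q)
{
	unsigned int handled = 0u;

	while (q->tail != q->head) {
		uint8_t event = q->events[q->tail];

		q->tail = (q->tail + 1u) % EVENT_QUEUE_LEN;
		blocking_transfer(event);
		handled++;
	}
	return handled;
}
```

Posting two events from the interrupt side performs no transfers; both transfers happen only when `worker_drain()` runs. This corresponds to the second design option discussed in this thread, where all processing takes place at thread level.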

  • Hi Karl,

    I refactored the code to do the meaty stuff in a thread; all the ISR now does is signal the thread to run. This has allowed turning asserts back on; all seems to be working well at the moment.

    The main processor is an STM and the BLE processor is Nordic, but it is interesting what you mention about EasyDMA.

    Regards,
    Owain

  • Hello Owain,

    OwainJangor said:
    I refactored the code to do the meaty stuff in a thread; all the ISR now does is signal the thread to run. This has allowed turning asserts back on; all seems to be working well at the moment.

    Thank you for the update - I am happy to hear that you have aligned your code with the best practice for this and that you are no longer having an issue with this! :) 

    OwainJangor said:
    The main processor is an STM and the BLE processor is Nordic, but it is interesting what you mention about EasyDMA.

    Aha, I understand. I don't have any personal experience with STM, so I can't speak to their features or capabilities, but in general I can say that EasyDMA is indeed a very useful feature when working with asynchronous memory processes, at least on Nordic devices.

    Best regards,
    Karl
