Using a failed malloc inside my secure partition (TF-M) reboots the board instead of returning an error

Hi,

I'm using the following function in my project, in order to get an idea of how much memory can be reserved with malloc:

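static uint32_t GetFreeMemorySize(void)
{
  uint32_t  i;
  uint32_t  len;
  uint8_t*  ptr;

  /* Try ever larger blocks, 1 KiB at a time, until malloc fails. */
  for(i=1;;i++)
  {
    len = i * 1024;
    ptr = (uint8_t*)malloc(len);
    if(!ptr){
      break;  /* i KiB could not be allocated */
    }
    free(ptr);
  }

  return i;
}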

This function should return the last value of i that was tested.
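
To make the expected behaviour concrete, here is a minimal sketch of the same probing loop that can be run off-target. It is not the code from the project: the names `probe_first_failing_kib` and `fake_malloc`, the injectable allocator hooks, and the `max_kib` upper bound are all assumptions added for illustration. The stand-in allocator simply refuses anything above 8 KiB, so the loop stops at a known point.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocator hooks, so the probe can run against malloc/free or a test double. */
typedef void *(*alloc_fn)(size_t);
typedef void (*free_fn)(void *);

/* Returns the first size, in KiB, that could not be allocated,
 * or max_kib + 1 if every size up to max_kib succeeded. */
static uint32_t probe_first_failing_kib(alloc_fn alloc, free_fn release,
                                        uint32_t max_kib)
{
    uint32_t i;

    for (i = 1; i <= max_kib; i++) {
        void *ptr = alloc((size_t)i * 1024u);
        if (ptr == NULL) {
            break;              /* i KiB could not be allocated */
        }
        release(ptr);
    }
    return i;
}

/* Hypothetical stand-in allocator: refuses any request above 8 KiB. */
static void *fake_malloc(size_t len)
{
    return (len <= 8u * 1024u) ? malloc(len) : NULL;
}
```

Calling `probe_first_failing_kib(fake_malloc, free, 64)` returns the first size that failed, so the largest size that succeeded is one less than the return value. The upper bound only guarantees that the loop terminates on systems where malloc never reports failure.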

When I call it from the main.c of my application (the non-secure partition), it works correctly. However, when I run it inside my secure partition (TF-M), it only works up to the last malloc. On that last call, instead of malloc failing and the function returning, the board reboots.

I don't know if there is a default setting that causes a failed malloc to restart the board, but I would like to prevent this and get the return value of i.

I know that the function runs inside my secure partition. I have verified this by adding a printf inside the function and observing the output of TF-M, as can be seen below:

static uint32_t GetFreeMemorySize(void)
{
  uint32_t  i;
  uint32_t  len;
  uint8_t*  ptr;

  for(i=1;;i++)
  {
    len = i * 1024;
    ptr = (uint8_t*)malloc(len);
    if(!ptr){
      break;
    }
    free(ptr);
    /* i is unsigned, so print it with %u */
    printf("GetFreeMemorySize (from SPM) -> i = %u\n", (unsigned int)i);
  }

  return i;
}

Regards,
Pablo

Useful information about the project:
- I'm using the nRF5340DK development kit.
- nRF Connect SDK v2.0.0
- The project is based on the example: TF-M Secure Partition Sample.

  • Hi Pablo, 

    Could you provide the project and files you used for the TF-M setup? Did you use the IPC API or the TF-M library?
    From my understanding, it might be a feature of TF-M that it doesn't allow you to malloc beyond the size of the secure area and triggers a hardfault instead. This would explain the board reset you observed.


    I will check with the team here. It would be nice to have your project source to test.

    Have you tried setting

    CONFIG_RESET_ON_FATAL_ERROR=n
    and checking whether you get any log output?



  • Hi,

    I am using the TF-M library. Would it be preferable to use the IPC API?

    I have reproduced the same problem in the tfm_secure_partition example, in order to separate it from other possible problems in my project. I built the application for NCS 2.0.0.

    I attach the tfm_secure_partition project with the GetFreeMemorySize function so you can reproduce the problem. You will be able to see that it runs first on the main (non-secure partition) and then fails when it runs on the secure partition.

    Project: tfm_secure_partition.zip

    I have added the CONFIG_RESET_ON_FATAL_ERROR=n option and this is what is shown in the outputs:

    Application output:



    TF-M output:

    What I want is for the function to finish and return a value, as it does in the non-secure partition. I would like to know how to prevent malloc from causing that error, or whether there is an alternative to malloc.

    Regards,

    Pablo

  • Thanks for the code, Pablo. Since you call the function directly inside tfm_dp_secret_digest_req(), I guess the problem has nothing to do with the choice between the IPC API and the TF-M library.
    I have reproduced the same issue here and also checked internally with our team, but we don't know what caused the fault.

    This issue is a little bit outside of our knowledge. I would suggest posting the question to TF-M.

  • Hi Hung,

    Thank you for testing the code. I will try to post the question to TF-M.

    However, to try to get more information about the error, shouldn't some message be displayed when a hardfault occurs? Is there a configuration to enable that?

    Best regards,

    Pablo

  • Hi Pablo, 
    You can add the following: 

    Add this CMake code at the bottom of the app's CMakeLists.txt.
    
    set_property(
    TARGET zephyr_property_target
    APPEND PROPERTY TFM_CMAKE_OPTIONS
    
    # Use -O0. So GDB reports correct line numbers.
    -DCMAKE_BUILD_TYPE=Debug
    
    # Halt instead of rebooting on internal TF-M faults.
    -DTFM_HALT_ON_CORE_PANIC=ON
    
    # NB: Probably not enough MPU regions on nrf53 in Isolation level 2
    -DNULL_POINTER_EXCEPTION_DETECTION=ON
    
    # Add a debug function that logs the memory protection config, and
    # then invoke it from the SPE with log_memory_protection() to see
    # the memory protection configuration.
    -DLOG_MEMORY_PROTECTION=ON
    
    # WDT will force tests to continue for certain types of TF-M
    # tests. This can be inconvenient when debugging. But the WDT can
    # also be a necessary part of the test execution.
    # -DWATCHDOG_AVAILABLE=0
    )
    
    Add these Kconfig options to your non-secure Zephyr application's prj.conf:
    
    CONFIG_TFM_PARTITION_LOG_LEVEL_DEBUG=y
    CONFIG_TFM_SPM_LOG_LEVEL_DEBUG=y
    
    # Dump exception info. Must be combined with enabling log output.
    CONFIG_TFM_EXCEPTION_INFO_DUMP=y
    
    # Don't use the minimal TF-M configuration as that doesn't support logging
    CONFIG_TFM_PROFILE_TYPE_NOT_SET=y

    After that, I can see FATAL ERROR: BusFault in the log.
    But we are still struggling to find out why we get a BusFault here.
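
On the earlier question about an alternative to malloc: one common option on small targets is to avoid the heap and hand out memory from a statically sized buffer. The following is only a minimal sketch under stated assumptions. The names `pool_alloc` and `pool_reset` and the 4 KiB `POOL_SIZE` are hypothetical, the allocator is not thread-safe, and it cannot free individual blocks. Whether it sidesteps the BusFault seen in this thread has not been verified.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4096u  /* hypothetical pool size in bytes, a multiple of 8 */

/* Backing storage declared as 64-bit words so the pool is 8-byte aligned. */
static uint64_t pool_words[POOL_SIZE / sizeof(uint64_t)];
static size_t pool_used;  /* bytes handed out so far */

/* Bump allocator: returns NULL when the pool is exhausted
 * instead of touching memory outside the buffer. */
static void *pool_alloc(size_t len)
{
    size_t rounded;
    void *block;

    if (len == 0 || len > POOL_SIZE) {
        return NULL;
    }
    /* Round up to 8 bytes so every returned block stays aligned. */
    rounded = (len + 7u) & ~(size_t)7u;
    if (rounded > POOL_SIZE - pool_used) {
        return NULL;
    }
    block = (uint8_t *)pool_words + pool_used;
    pool_used += rounded;
    return block;
}

/* Releases every block at once; there is no per-block free. */
static void pool_reset(void)
{
    pool_used = 0;
}
```

Because the pool size is fixed at build time, the amount of available memory is known in advance and does not need to be probed at run time.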