
Instruction cache (I-Cache) usage and user guide

Hi.

I cannot find a description of how the I-Cache works or which algorithm it uses.

I read the "NVMC - Non-volatile memory controller" section of the documentation. I think I have already enabled the cache with the statement "NRF_NVMC->ICACHECNF = 1". I see a change in performance (I am running the CoreMark test), even though I have not placed any functions in RAM.

I still don't understand how the cache works. The information about the module is very scarce. I cannot find any examples, application notes, or a description of cache profiling.

Can you point me to something helpful?

  • Not an official answer, but maybe some useful tips. The cache has a private SRAM which is inaccessible by the user, likely 2k bytes. The NVMC must be ready before enabling the cache:

     // Wait until the NVMC is ready, then enable the cache (bit 0)
     // and the hit/miss profiling counters (bit 8)
     while (NRF_NVMC->READY == 0) ;
     NRF_NVMC->ICACHECNF = 0x101;

    Here I have enabled the cache as well as the hit/miss tracking counters, which are 2 non-resettable saturating counters that increment on every cache hit and miss. As the code executes it fills the cache (cache miss count++); once parts of the code are in the cache, they execute from the private SRAM instead of FLASH (cache hit count++). If the code loop is bigger than the cache, the cache keeps reloading (cache miss count++). If the loop is small enough to fit in the private SRAM, fetches hit the cache (cache hit count++), i.e. optimum performance.

    However, if an interrupt occurs, the interrupt handler loads new FLASH code into the cache as it executes (cache miss count++). If the combined size of the active interrupt handlers and the code loop fits in the cache, you are back to maximum performance with no cache misses, but that is unlikely unless most interrupts are disabled for the duration of the test, and the test itself must be small and not call large functions. Running the CoreMark tests via function calls might exceed the size of the cache, so maybe write some other tight loop and measure performance that way.
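
A minimal sketch of such a tight loop, assuming a small xorshift kernel is an acceptable stand-in for the workload. The names `tight_loop_kernel` and `cycles_elapsed` are hypothetical and not part of any SDK. On the target, the start and end readings would come from a free-running cycle counter (for example the Cortex-M4 DWT cycle counter), which is not shown here.

```c
#include <stdint.h>

/* Hypothetical workload: only a few instructions per iteration, so the
 * whole loop should fit comfortably in a small instruction cache. */
uint32_t tight_loop_kernel(uint32_t seed, uint32_t iterations)
{
    uint32_t x = seed;
    for (uint32_t i = 0; i < iterations; i++) {
        /* xorshift32 step: the result depends on every iteration,
         * so the compiler cannot simply discard the loop. */
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
    }
    return x;
}

/* Elapsed ticks between two readings of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single counter wrap. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}
```

Running the kernel once with the cache disabled and once with it enabled, ideally with interrupts masked, and comparing the elapsed cycles should show the effect of the cache on a loop that fits entirely in it.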

    Once the cache hit/miss registers reach max values there is no more useful information available without a reset. To display use something like this:

     NRF_LOG_INFO("Cache Enabled 0x%08X 0x%08X 0x%08X 0x%08X", NRF_NVMC->ICACHECNF, NRF_NVMC->IHIT, NRF_NVMC->IMISS, NRF_NVMC->CONFIG);
     NRF_LOG_FLUSH();
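
The raw counter values are easier to compare between runs as a hit rate. The following is a rough sketch of that calculation, not an SDK function: `cache_hit_permille` and `CACHE_COUNTER_MAX` are hypothetical names, and the saturation value assumes the counters are 32 bits wide.

```c
#include <stdint.h>

/* Assumption: the counters are 32 bits wide and stick at this value. */
#define CACHE_COUNTER_MAX 0xFFFFFFFFu

/* Returns the hit rate in permille (0..1000), or -1 if there are no
 * samples yet or if either counter has saturated, in which case the
 * ratio is no longer meaningful. */
int32_t cache_hit_permille(uint32_t hits, uint32_t misses)
{
    if (hits == CACHE_COUNTER_MAX || misses == CACHE_COUNTER_MAX) {
        return -1;
    }
    uint64_t total = (uint64_t)hits + misses;
    if (total == 0) {
        return -1;
    }
    /* 64-bit arithmetic avoids overflow in hits * 1000. */
    return (int32_t)(((uint64_t)hits * 1000u) / total);
}
```

On the target this would be called as `cache_hit_permille(NRF_NVMC->IHIT, NRF_NVMC->IMISS)`, using the same registers as the log statement above.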
    

    The cache SRAM consumes additional power, so power consumption goes up while the cache is enabled. However, FLASH code execution uses 2 wait states whereas SRAM code execution uses none, so the loop code executes faster, which can, and usually does, lead to lower overall power consumption as well as faster execution.
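
The trade-off can be written as a simple energy comparison. This is only a sketch under stated assumptions: the symbols below are introduced here for illustration, the CPU is assumed to go to sleep as soon as the work is done, and sleep power is assumed to be the same in both cases.

```latex
% P_on,  t_on  : active power and run time with the cache enabled
% P_off, t_off : active power and run time with the cache disabled
E_{\text{on}} = P_{\text{on}} \, t_{\text{on}}, \qquad
E_{\text{off}} = P_{\text{off}} \, t_{\text{off}}

% The cache saves energy for a given piece of work when
E_{\text{on}} < E_{\text{off}}
\iff
\frac{P_{\text{on}}}{P_{\text{off}}} < \frac{t_{\text{off}}}{t_{\text{on}}}
```

In words: the cache pays off whenever the relative speed-up is larger than the relative increase in active power.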
