FPU performance calculation - optimizing real-time math

Hi

I am using an nRF52832 with the S132 SoftDevice and SDK 17, implementing an algorithm that requires some math.

For example, I am doing a matrix multiplication with about 800 float multiplications. As I understand it, a multiplication takes 3 cycles on the Arm Cortex-M4, running at 32 MHz. Even when optimizing for time, I see those 2400 cycles taking more than 200 us. Does that make sense?
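As a rough sanity check of those numbers, here is a minimal sketch that converts between cycles and time. It uses the figures quoted above (3 cycles per multiplication, 32 MHz); the helper names `cycles_to_us` and `cycles_per_op` are made up for illustration, and the arithmetic is meant to be run on a PC, not on the target. Note that the nRF52832 core clock is documented as 64 MHz, which would make the ideal figure smaller still.

```c
#include <stdint.h>

/* Hypothetical helper: convert a cycle count to microseconds
 * for a given core clock in Hz. Host-side arithmetic only. */
static double cycles_to_us(uint32_t cycles, uint32_t clock_hz)
{
    return (double)cycles * 1.0e6 / (double)clock_hz;
}

/* Hypothetical helper: how many core cycles each operation actually
 * cost, given a measured duration in microseconds. */
static double cycles_per_op(double measured_us, uint32_t clock_hz, uint32_t ops)
{
    return measured_us * 1.0e-6 * (double)clock_hz / (double)ops;
}
```

With these assumptions, `cycles_to_us(2400, 32000000)` gives 75 us, and `cycles_per_op(200.0, 32000000, 800)` gives about 8 cycles per multiplication (about 16 at 64 MHz). The 2400-cycle estimate counts only the multiplications; the loads, additions, stores and loop overhead of the matrix product are extra, so some gap between estimate and measurement is expected.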

Is there some way (other than changing the algorithm) to improve this performance? Some other optimization flag to set, enabling the FPU, or a way to allocate the memory so that it is accessed more efficiently?

Is there some example/reference you can refer me to?

Thanks!

  • Hi

    What IDE are you using for development? In SEGGER Embedded Studio you can choose an optimization level based on what your application needs.

    I haven't done the math, but optimization level 3 (if you have room for it) should provide the highest possible speed for your application, AFAIK. The FPU should also be enabled by default in most of our SDK v17.1.0 examples.

    Best regards,

    Simon
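To illustrate the FPU point, here is a minimal sketch of a single-precision matrix multiplication, not the original poster's code. The Cortex-M4 FPU handles single precision only, so any `double` arithmetic (for example from a literal written without the `f` suffix) falls back to software routines. The function name `mat_mul_f32` is hypothetical. The guard at the top assumes a GCC or Clang toolchain, where the `__ARM_FP` macro is expected to be defined only when hardware floating point is enabled (typically with `-mfloat-abi=hard -mfpu=fpv4-sp-d16`); it is skipped on other compilers and on a PC.

```c
#include <stddef.h>

/* Assumption: on Arm GCC/Clang builds, __ARM_FP is defined only when
 * hardware floating point is enabled. Skipped on non-Arm hosts. */
#if defined(__GNUC__) && defined(__arm__) && !defined(__ARM_FP)
#error "Hardware FPU not enabled: check the -mfloat-abi and -mfpu options"
#endif

/* Hypothetical helper: c = a * b for row-major float matrices,
 * where a is rows x inner, b is inner x cols, c is rows x cols. */
static void mat_mul_f32(const float *a, const float *b, float *c,
                        size_t rows, size_t inner, size_t cols)
{
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            /* The 'f' suffix keeps the accumulator in single precision,
             * so no double-precision arithmetic is introduced here. */
            float acc = 0.0f;
            for (size_t k = 0; k < inner; ++k) {
                acc += a[i * inner + k] * b[k * cols + j];
            }
            c[i * cols + j] = acc;
        }
    }
}
```

For production use, the CMSIS-DSP library provides `arm_mat_mult_f32`, which is worth comparing against a hand-written loop like this one.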
