nRF52 FP exception conundrum

Apologies in advance for what could be a rudimentary question. I have a large nRF52832-based project that uses floating point in only a few small but important places. Although it is pre-production, I have a decent-sized batch of units in burn-in that work well.

My problem is that certain units keep taking unexpected FP faults on harmless FP operations such as loads and comparisons. I have spent innumerable hours on this, and months ago I added code that uses fpclassify to examine every FP value brought in from an external source (a sensor), to ensure that I only operate on valid data. I am confident of this.
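For readers following along, the kind of fpclassify gate described above might look roughly like the sketch below. This is not the poster's actual code: the helper name `sensor_float_from_bytes`, the 4-byte little-endian input, and the choice to reject subnormals along with NaN and infinity are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 4 raw sensor bytes into a float and accept the
 * value only if it classifies as zero or as a normal number.
 * Assumes the bytes arrive in the CPU's own byte order (little-endian on
 * the nRF52). */
static bool sensor_float_from_bytes(const uint8_t raw[4], float *out)
{
    float value;

    assert(raw != NULL && out != NULL);

    /* memcpy makes no assumption about the alignment of the byte buffer. */
    memcpy(&value, raw, sizeof value);

    switch (fpclassify(value)) {
    case FP_ZERO:
    case FP_NORMAL:
        *out = value;
        return true;
    default:
        /* FP_NAN, FP_INFINITE, FP_SUBNORMAL: leave *out untouched. */
        return false;
    }
}
```

The caller would only use `*out` when the function returns true. Whether subnormals should be rejected is a design choice; the original post only says that values are checked for validity.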

This is difficult for me to debug because these are test units operating in a different country, but I have seen the failure pattern before, and the timing indicates that this is indeed another occurrence of an FP fault.

I fully admit that I do not understand the mechanisms of FP faults: enabling them, disabling them, or why they occur. This is not a case of /0 or sqrt(neg). Just to give you a flavor of what I'm contending with: the last time I had this issue, some months back, I was able to reproduce it in a specific situation, and adding this exact code at the top of my app_sched handler completely eliminated that specific FP fault:

{ volatile float f = 0.0; isless(f, 0.0); }

The fact that this did anything at all is mind-blowing to me and only showed my lack of understanding of how the FP unit works. In any case, the above absolutely fixed the repro that I had, but now that I have another intermittent failure in the field, it is time to try to understand what may be happening.

In case someone asks (because someone has done so already), let me say that a) these are the relevant compiler options: '-mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16 -mthumb -mabi=aapcs', and b) yes, of course I have the following in my power_manage loop:

#define FPU_EXCEPTION_MASK 0x0000009F
__set_FPSCR(__get_FPSCR() & ~(FPU_EXCEPTION_MASK));
(void) __get_FPSCR();
NVIC_ClearPendingIRQ(FPU_IRQn);
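
As a debugging aid, it may help to know which bits the 0x9F mask in the snippet above covers. In the ARMv7-M FPSCR these are the cumulative exception flags at bits 0 to 4 and bit 7. The sketch below decodes a raw FPSCR value into flag names so it can be logged before the flags are cleared. The helper name `fpscr_flags_to_string` and the idea of logging are assumptions for illustration; how the value is obtained (for example from `__get_FPSCR()`) is left to the firmware.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Cumulative exception flags in the Cortex-M4 FPSCR.
 * OR-ed together they give 0x9F, the mask used in the post. */
#define FPSCR_IOC (1u << 0) /* invalid operation */
#define FPSCR_DZC (1u << 1) /* divide by zero    */
#define FPSCR_OFC (1u << 2) /* overflow          */
#define FPSCR_UFC (1u << 3) /* underflow         */
#define FPSCR_IXC (1u << 4) /* inexact           */
#define FPSCR_IDC (1u << 7) /* input denormal    */

/* Hypothetical helper: write the space-separated names of the set flags
 * into buf and return how many names were written in full. */
static int fpscr_flags_to_string(uint32_t fpscr, char *buf, size_t len)
{
    static const struct { uint32_t mask; const char *name; } flags[] = {
        { FPSCR_IOC, "IOC" }, { FPSCR_DZC, "DZC" }, { FPSCR_OFC, "OFC" },
        { FPSCR_UFC, "UFC" }, { FPSCR_IXC, "IXC" }, { FPSCR_IDC, "IDC" },
    };
    int count = 0;
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if ((fpscr & flags[i].mask) == 0) {
            continue; /* flag not set */
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? " " : "", flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break; /* buffer too small: stop, the last name may be cut off */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

For example, a value of 0x00000011 decodes to "IOC IXC". Seeing which flag is set just before a failure would narrow down whether an invalid operation, an overflow, or merely an inexact result is involved.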

Any thoughts, pointers, or explanations would be greatly appreciated. I am not blocked - again, this is an intermittent field failure - however, even understanding how this all works and why my "isless()" had such a dramatically positive effect might help me to reduce the issue.

Thanks in advance.

  • Apparently no community thoughts in this area. Any official thoughts?

    Alternatively, is there a way to completely disable FP traps, as a brute force workaround?

  • Thank you. I am trying. At this moment my challenge is that there is no repro, and it is occurring intermittently in the field.

    By code inspection and reconfiguration I have determined that it is somehow related to some 64-bit (double) operations that are occurring just prior to the 32-bit (float) operations that are trapping.

    Without the double operations, there is no crash. With the double operations having been performed, the subsequent (and completely unrelated) float operations take a trap.
