Error handling in production - saving the address of the function containing an APP_ERROR_CHECK call

Question

Hello! 
 We are trying to improve the in-production error handling for one of our products. In the past, we have always saved the LSB of the error code in the GPREGRET register, but that is not always enough information to characterize or solve complicated problems. 
 We have external eeprom and have confirmed that saving records to eeprom in our app_error_fault_handler function does not create any problems. The issue we are having is with getting the information that we need to effectively deal with the errors. 
 We know that when DEBUG=TRUE, APP_ERROR_CHECK calls app_error_handler with the line number and file name obtained from the __LINE__ and __FILE__ macros, but when we build without DEBUG, these are not available. We would prefer to build without DEBUG=TRUE, both to keep the application size manageable and because we don't want most of the other effects of DEBUG=TRUE. 
 If we just change the definition of APP_ERROR_HANDLER so that it always calls app_error_handler instead of app_error_handler_bare it sort of gets us what we want, but the application still gets bloated with all the extra file paths. 
 #ifdef DEBUG
#define APP_ERROR_HANDLER(ERR_CODE) \
 do \
 { \
 app_error_handler((ERR_CODE), __LINE__, (uint8_t*) __FILE__); \
 } while (0)
#else
#define APP_ERROR_HANDLER(ERR_CODE) \
 do \
 { \
 app_error_handler((ERR_CODE), __LINE__, (uint8_t*) __FILE__); \
 } while (0)
#endif 
 What we would like to do is to save the address of the function containing the APP_ERROR_CHECK call so that we can look it up in the .map file when we get the log report. 
 I was hoping that we would be able to run __builtin_return_address(0) in app_error_handler_bare 
 void app_error_handler_bare(ret_code_t error_code)
{
 error_info_t error_info =
 {
 .line_num = (uint32_t)__builtin_return_address(0),
 .p_file_name = NULL,
 .err_code = error_code,
 };

 app_error_fault_handler(NRF_FAULT_ID_SDK_ERROR, 0, (uint32_t)(&error_info));

 UNUSED_VARIABLE(error_info);
} 
 to get the address of the thing that ran before it, but the addresses that I get this way are all over the place. Sometimes I get a different address for subsequent APP_ERROR_CHECK calls at the same location. Is there any way to reliably get the address of the function that called APP_ERROR_CHECK? 
 Relevant info: building with gcc using nRF5 SDK version 15.2 
 Thanks in advance

Hieu · Accepted Answer

Hi jrowe, 
 This is a very interesting topic. 
 Several years ago, I had this exact same need. I experimented with assigning a number to each application source file. I wasn't very happy with it because it felt too non-standard and not very maintainable. It worked for the small project I had at the time, but at that scale the cost of using __FILE__ is very small anyway, so the benefit was moot. I think your idea of logging the address of the function is a lot nicer and should also scale better. 
 Firstly, I see that you are trying to use __builtin_return_address(0) to replace __LINE__. This doesn't save you as much space as replacing __FILE__ would. As you have found, the file names are what consume the memory here. Meanwhile, __LINE__ is pretty important for figuring out exactly where in a function things went wrong. 
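 To illustrate the point about __FILE__ versus __LINE__, here is a minimal sketch of a check macro that keeps the line number and drops the file name. The names APP_ERROR_CHECK_LINE, APP_ERROR_HANDLER_LINE, app_error_handler_line and last_error are hypothetical and not part of the SDK. The typedefs are stand-ins that mirror the fields used in your snippet so that the code compiles on its own; in a real project the handler would forward to app_error_fault_handler the way your app_error_handler_bare does.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the SDK types (assumption: same fields as in the question). */
typedef uint32_t ret_code_t;

typedef struct {
    uint32_t       line_num;
    const uint8_t *p_file_name;
    uint32_t       err_code;
} error_info_t;

/* Hypothetical: stands in for the record that would be written to EEPROM. */
static error_info_t last_error;

/* Hypothetical handler: stores the line number but no file name,
 * so no file path strings end up in flash. */
static void app_error_handler_line(ret_code_t err_code, uint32_t line_num)
{
    last_error.line_num    = line_num;
    last_error.p_file_name = NULL;
    last_error.err_code    = err_code;
}

#define APP_ERROR_HANDLER_LINE(ERR_CODE)                            \
    do {                                                            \
        app_error_handler_line((ERR_CODE), (uint32_t)__LINE__);     \
    } while (0)

/* Same shape as APP_ERROR_CHECK: only calls the handler on a non-zero code. */
#define APP_ERROR_CHECK_LINE(ERR_CODE)                  \
    do {                                                \
        const ret_code_t local_err_code = (ERR_CODE);   \
        if (local_err_code != 0) {                      \
            APP_ERROR_HANDLER_LINE(local_err_code);     \
        }                                               \
    } while (0)
```

 A line number on its own is ambiguous across files, so it would probably need to be stored together with something that identifies the function, such as the return address discussed next.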
 Next, regarding the return value of __builtin_return_address(0): that function gives you the address that the current function will return to. That is an address somewhere inside the caller, just after the call instruction, not the start address of the caller itself. 
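 A small sketch of that difference, assuming GCC or a compatible compiler. The names callee, caller and seen_return_address are made up for illustration. The noinline attribute is there because an inlined callee would report the return address of whatever function it was inlined into, and the printf calls after callee() keep the call from being turned into a tail call.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: holds the value seen inside callee(). */
static uintptr_t seen_return_address;

/* noinline keeps this a real call, so it has its own return address. */
__attribute__((noinline)) static void callee(void)
{
    seen_return_address = (uintptr_t)__builtin_return_address(0);
}

__attribute__((noinline)) void caller(void)
{
    callee();
    /* Work after the call prevents a tail call (a plain jump) to callee(). */
    printf("caller() starts at: %p\n", (void *)(uintptr_t)&caller);
    printf("return address:     %p\n", (void *)seen_return_address);
}
```

 The second value points at the instruction that follows the call inside caller(), so it never equals the start address of caller(). On Cortex-M, both values also have bit 0 set because the code is Thumb code.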
 You said that the addresses __builtin_return_address() returns are all over the place. Could you please elaborate on how random they are? 
 In particular, when you match the return value against the MAP file, is the return value within the range of the caller function at all? 
 Similarly, when several app_error_handler_bare() calls were made in succession from the same place, was the return value of __builtin_return_address() in the later calls just a few addresses after the previous one? 
 I tested the theory on the UART example and got the expected result. Below is my test code and output. 
 I don't have the same environment as you do, so I went ahead with SEGGER Embedded Studio v5.42a (which supposedly uses ARM GCC for compiling) and nRF5 SDK v17.1.0. 
// Need to increase the TX buffer size to fit the test output
#define UART_TX_BUF_SIZE 4096

...

void a(void);
void b(uint8_t);

void a(void) {
    uint8_t i;
    b(1);

    printf("addr of a(): %p\n", (void *)&a);
    printf("addr of b(): %p\n", (void *)&b);

    for (i = 0; i < 2; i++) {
        printf("a() - loop %d\n", i);
        b(2);
        b(3);
    }
}

void b(uint8_t num) {
    uint8_t i;
    // Print the return address twice per iteration to show that it is stable within one call
    for (i = 0; i < 2; i++) {
        printf("num = %d | __builtin_return_address(0) returns %p\n", num, __builtin_return_address(0));
        printf("num = %d | __builtin_return_address(0) returns %p\n", num, __builtin_return_address(0));
    }
}

int main(void) {
    ...
    a();
    ...
}
 Output: 
 num = 1 | __builtin_return_address(0) returns 00001265
num = 1 | __builtin_return_address(0) returns 00001265
num = 1 | __builtin_return_address(0) returns 00001265
num = 1 | __builtin_return_address(0) returns 00001265
addr of a(): 0000125d
addr of b(): 00001221
a() - loop 0
num = 2 | __builtin_return_address(0) returns 00001285
num = 2 | __builtin_return_address(0) returns 00001285
num = 2 | __builtin_return_address(0) returns 00001285
num = 2 | __builtin_return_address(0) returns 00001285
num = 3 | __builtin_return_address(0) returns 0000128b
num = 3 | __builtin_return_address(0) returns 0000128b
num = 3 | __builtin_return_address(0) returns 0000128b
num = 3 | __builtin_return_address(0) returns 0000128b
a() - loop 1
num = 2 | __builtin_return_address(0) returns 00001299
num = 2 | __builtin_return_address(0) returns 00001299
num = 2 | __builtin_return_address(0) returns 00001299
num = 2 | __builtin_return_address(0) returns 00001299
num = 3 | __builtin_return_address(0) returns 000012fd
num = 3 | __builtin_return_address(0) returns 000012fd
num = 3 | __builtin_return_address(0) returns 000012fd
num = 3 | __builtin_return_address(0) returns 000012fd
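
 To show how a logged return address could be matched against the MAP file, here is a minimal host-side sketch. The names symbol_t, symbols and lookup_symbol are hypothetical. The table holds only the two start addresses printed above, with bit 0 (the Thumb bit) cleared; a real table would be generated from the MAP file and would also need the function sizes or the following symbols, since this sketch attributes every address above the last entry to that entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry built from the MAP file. */
typedef struct {
    uint32_t    start;  /* function start address, Thumb bit cleared */
    const char *name;
} symbol_t;

/* Sorted by start address. Values come from the test output:
 * b() printed as 0x1221 and a() as 0x125d, both with bit 0 set. */
static const symbol_t symbols[] = {
    { 0x00001220u, "b" },
    { 0x0000125cu, "a" },
};

/* Returns the name of the last symbol that starts at or before addr,
 * or NULL if addr lies below every symbol in the table. */
static const char *lookup_symbol(uint32_t addr)
{
    const char *best = NULL;

    addr &= ~1u; /* drop the Thumb bit from the logged address */

    for (size_t i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (symbols[i].start <= addr) {
            best = symbols[i].name;
        }
    }
    return best;
}
```

 With this table, the logged value 0x00001265 resolves to a(), which is the function that called b(1).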
 
 Hieu
