This post is older than 2 years and might not be relevant anymore

Profiling on an nRF51422

Hi, does anyone have experience with profiling on an nRF51422? I think what would be necessary is at least one high-priority timer to sample the program counter, some kind of channel to get the data out of the µcontroller (ANT, for example), and something to analyse the data off target, like Google perftools. In addition, it would be very helpful to get the top of the call stack (just the function backtrace) for analysing the cumulative CPU usage of individual functions.

I wonder if it would be doable in conjunction with a SoftDevice, especially the high-priority timer part of the idea.

cheers, Torsten

  • The CrossWorks studio I've been using has something like this. In the absence of profile/trace hardware on the chip (which the nRF51822 doesn't have), it uses timed sampling to gather information. It's a little crude and you need to run through a lot of cycles of your code to really build up a picture, but it works.

    That's external to the device, however: it's CrossWorks itself, running on the laptop, which interrogates the debug interface on a constant basis to grab the current PC and then maps it later. I suspect that what you suggest, doing it all on-chip, isn't going to work too well. The cost of taking the samples, unless you sample very rarely, is going to affect the timing of the rest of your code to the point that I rather doubt your results will be worthwhile.

    I'd look for something external that uses the debug interface to gather the data off-chip, if you can find one. A bit of work for sure, but it should give you better results.

  • When I needed to optimize some of my code, I did two things:

    1. Selected 2 GPIOs for LEDs and toggled the LEDs at key milestones of the code. One line was a kind of synchronization marker (start of the repeated computation), the other showed progress. I then used a scope to analyze the time spent in the individual parts.
    2. I wrote a "test suite" for unit testing that also does timing. Basically, I type "make test" and it rebuilds "test-main.c" against all other sources, flashes it to the device (where it runs all the tests and stores the results in a binary log), waits some time, pulls back the memory with the binary log, and parses the log. This not only helped me tremendously when implementing complex computations, but it also allowed me to record the timing of the building blocks of those computations.
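    The GPIO-milestone technique in point 1 might look roughly like this. Here `gpio_toggle()` is a stub standing in for the real register write (`NRF_GPIO->OUT ^= ...` on the nRF51), and the pin numbers are hypothetical, so the pattern can run on a host:

    ```c
    #include <stdio.h>

    #define PIN_SYNC     21  /* hypothetical pin: start/end-of-computation marker */
    #define PIN_PROGRESS 22  /* hypothetical pin: per-milestone marker */

    /* Host stub: counts edges instead of driving real GPIO pins. */
    static unsigned toggles[32];
    static void gpio_toggle(int pin) { toggles[pin]++; }

    static void compute_step(int i) { (void)i; /* the work being profiled */ }

    int main(void)
    {
        gpio_toggle(PIN_SYNC);          /* scope trigger: computation starts */
        for (int i = 0; i < 4; i++) {
            compute_step(i);
            gpio_toggle(PIN_PROGRESS);  /* one edge per milestone */
        }
        gpio_toggle(PIN_SYNC);          /* computation ends */
        printf("sync=%u progress=%u\n", toggles[PIN_SYNC], toggles[PIN_PROGRESS]);
        return 0;
    }
    ```

    On the scope, the gap between the two sync edges is the total time, and the spacing of the progress edges shows where it goes.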
  • Taking the PC via JTAG seems like a very good idea. But I think it will be hard to capture the call stack. Maybe it's possible to read relevant portions of the whole stack and extract the call stack from that off target. I like PC sampling very much, because it's easy to implement and close to non-intrusive if you use a very low sample rate. If you have an application that you can keep under typical load, you just need more time when you lower the sample rate :-)
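    The off-target analysis step can be very simple: bucket the sampled PCs into function address ranges. A toy sketch (the symbol table and sample addresses are made up; a real tool would read them from the ELF symbol table):

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical symbol table: [start, end) address range per function. */
    struct sym { uint32_t start, end; const char *name; unsigned hits; };

    static struct sym syms[] = {
        {0x18000, 0x18100, "main_loop", 0},
        {0x18100, 0x18300, "sort_list", 0},
        {0x18300, 0x18380, "compare",   0},
    };

    /* Attribute one PC sample to the function whose range contains it. */
    static void count(uint32_t pc)
    {
        for (unsigned i = 0; i < sizeof syms / sizeof *syms; i++)
            if (pc >= syms[i].start && pc < syms[i].end) { syms[i].hits++; return; }
    }

    int main(void)
    {
        /* Made-up PC samples pulled off the target. */
        uint32_t samples[] = {0x18010, 0x18150, 0x18200, 0x18310, 0x18320, 0x18350};
        for (unsigned i = 0; i < sizeof samples / sizeof *samples; i++)
            count(samples[i]);
        for (unsigned i = 0; i < sizeof syms / sizeof *syms; i++)
            printf("%s=%u\n", syms[i].name, syms[i].hits);
        return 0;
    }
    ```

    With enough samples, the hit counts approximate the share of CPU time each function consumes.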

  • Hi Nenik, I didn't understand how you use the LEDs. Some kind of pulse width modulation, where the brightness of the LEDs gives you an impression of the performance?

  • The LEDs were just there for visual feedback (when the outer testing loop was in the hundreds of ms). The LEDs are optional; I used the oscilloscope to actually measure the time of the individual "milestones" in the code flow. Trigger the scope (channel 1) on the synchronization pulse (generated at the start of the code you profile), watch the toggles on channel 2, and use the scope cursors to measure the time.

    You can adapt that approach to different situations. For the sake of example, imagine you needed to sort a list. You'd toggle ch1 on start/end of the sort (so the total sort time would be the width of the ch1 pulse), then e.g. toggle ch2 around your comparison function. Under this arrangement, you'd see how many times the comparison ran, how much time was spent in each comparison, and what the overall proportion of other code vs. the comparison function was (as a duty cycle).
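    The sort example could be instrumented roughly like this; the channel toggles are stubbed as counters so the sketch runs on a host (on target they would be GPIO writes), and `qsort` stands in for whatever sort is being profiled:

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Host stubs: on target these would toggle two GPIO pins
       watched on scope channels 1 and 2. */
    static unsigned ch1_edges, ch2_edges;
    static void ch1_toggle(void) { ch1_edges++; }
    static void ch2_toggle(void) { ch2_edges++; }

    /* Comparison function bracketed by ch2 edges, so each call
       shows up as one pulse on the scope. */
    static int cmp(const void *a, const void *b)
    {
        ch2_toggle();                       /* entry edge */
        int r = *(const int *)a - *(const int *)b;
        ch2_toggle();                       /* exit edge */
        return r;
    }

    int main(void)
    {
        int v[] = {4, 1, 3, 2};
        ch1_toggle();                       /* sort starts: ch1 rises */
        qsort(v, 4, sizeof *v, cmp);
        ch1_toggle();                       /* sort ends: ch1 falls */
        printf("sorted: %d %d %d %d ch1=%u comparisons=%u\n",
               v[0], v[1], v[2], v[3], ch1_edges, ch2_edges / 2);
        return 0;
    }
    ```

    The ch1 pulse width gives the total sort time; the ch2 duty cycle within it gives the fraction spent comparing. (The comparison count depends on the libc's qsort, so it isn't asserted here.)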
