I would like to create a RTC for stamping log messages, with 1ms resolution and including time from year to millisecond, like: YYYY-MM-DD HH:MM:SS.mmm.

What would be the best solution?

The 32kHz clock is not divisible by 1000, which means that some special handling is needed for accurate results.

I just wanted to point out some errors in the math here. The RTC is 32768Hz, which is the standard since it is divisible by 2^n. So, to get millisecond accuracy you won't be dividing by 1000. All you need to do here is divide by 32 to get 1/1024 of a second accuracy. Since this is better than millisecond accuracy you only need to do some floating point math and round up to get a millisecond accurate time stamp.
Because of the way the prescaler is implemented on nRF it might be easiest to run the prescaler at 0 for systick of 32768Hz and then use one of the compare registers to generate software events every 1/1024 of a second.
Similarly, you can just set a compare register to run freely and overflow every 32768 and capture the value of the compare register whenever your event occurs. Then floating point math gets you to your timestamp again.