nRF Cloud Log backend (REST API) not pushing logs to cloud

I'm trying to configure the nRF Cloud Log backend to help debug devices during field tests, without needing bulky UART harnesses to capture logs from the board's serial output. Relevant prj.conf excerpts below:

```
CONFIG_LOG=y
CONFIG_LOG_PROCESS_THREAD=y
CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=4096
CONFIG_LOG_BUFFER_SIZE=1024
CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=10000
CONFIG_LOG_PROCESS_TRIGGER_THRESHOLD=5
CONFIG_LOG_DEFAULT_LEVEL=3
CONFIG_NRF_CLOUD=y
CONFIG_NRF_CLOUD_REST=y
CONFIG_NRF_CLOUD_REST_AUTOGEN_JWT=y
CONFIG_LOG_BACKEND_NRF_CLOUD_OUTPUT_TEXT=y
CONFIG_NRF_CLOUD_LOG_DIRECT=y
CONFIG_NRF_CLOUD_LOG_BACKEND=y
CONFIG_NRF_CLOUD_LOG_OUTPUT_LEVEL=3
CONFIG_NRF_CLOUD_LOG_LOG_LEVEL_DBG=y
CONFIG_NRF_CLOUD_LOG_BUF_SIZE=1024
CONFIG_LOG_MODE_DEFERRED=y
```

At the very start of main.c, a buffer for REST responses is allocated, and a REST context is passed to the cloud log backend:
```
int main(void) {
  boot_write_img_confirmed();
  uint8_t *rx_buf = calloc(2048, sizeof(uint8_t));
  struct nrf_cloud_rest_context ctx = {
      .connect_socket = -1,
      .keep_alive = true,
      .timeout_ms = 10000,
      .auth = NULL,
      .rx_buf = rx_buf,
      .rx_buf_len = 2048,
  };

  nrf_cloud_log_rest_context_set(&ctx, "LOG_TEST");
```

This configuration appears to have some effect: the format of log lines on the UART output changes after enabling the cloud log backend. However, no data is actually pushed to the cloud for this device. Besides the log line format changes, the only indication that the cloud logger is present is occasional debug output between application logs:
`[00:00:24.045,135] <dbg> nrf_cloud_log_backend: logger_notify: Buffered lines:0, bytes:0; logged lines:0, bytes:0; sent lines:0, bytes:0; dropped lines:0`

This is the only message received from the nrf_cloud_log_backend module, and it never indicates any buffered or sent data. I know this device can use the nRF Cloud API, as we're using the LTE location API without issue.

Please let me know how I should adjust our firmware so that logs are properly pushed to the cloud.
Kaitlyn

  • Hi,

    It looks like your backend is present and running, but it is not enabled. For REST-based nRF Cloud logging, the log subsystem is not automatically enabled at runtime, so could you please try explicitly initializing and enabling it after LTE is connected and the date-time library is ready (nRF Cloud Log API):

```
nrf_cloud_log_init();
nrf_cloud_log_enable(true);
nrf_cloud_log_level_set(3);   /* or 4 for DBG */
```

    Also, please ensure the device ID passed to nrf_cloud_log_rest_context_set() matches the actual onboarded device ID in nRF Cloud. Let us know if you start seeing any sent data. If not, please share the NCS version you are using along with a minimal reproducible source code snippet and prj.conf, so we can test this on our side.

    Best Regards,
    Syed Maysum

  • Hi Syed,

    Thank you. This is what I was looking for, enabling the cloud logging with that has gotten it pushing to the cloud dashboard successfully.

    One problem I'm noticing is that the logging backend seems rather fragile. Sometimes it works fine; other times, on the same device and same firmware, it encounters some error or needs to drop messages, and after that point log output is cut off both over UART and over the cloud (no new data is displayed).

    I found that increasing buffer sizes helped with this, setting CONFIG_LOG_BUFFER_SIZE=2048 and CONFIG_NRF_CLOUD_LOG_RING_BUF_SIZE=4096 got it working for longer, but it still ends up dying unacceptably often. Is there some other solution that would improve this behaviour? Would moving to logging over MQTT help?

    For reference, this is what the UART output looks like around the time the logging backend dies (log content from our firmware, irrelevant here):

```
[00:00:46.178,131] <inf> ...
[00:00:46.179,992] <inf> ...
[00:00:46.183,074] <inf> ...
[00:00:52.178,100] <inf> ...

--- 16 messages dropped ---
```

    (No further messages were received; the device was left running for a couple of minutes after this occurred. Expected log output is ~2-3 lines per second.)

    Cloud logs also trail off prior to this occurring.

    Thank you

  • Hi,

    Thanks for the update. The “messages dropped” marker means logs are being produced faster than the logging system can process and send them. With REST-based cloud logging, this can happen if uploads are slow or the logging thread does not run often enough, and cloud logs will usually stop before UART output does. One thing to check is CONFIG_LOG_PROCESS_THREAD_SLEEP_MS; can you try reducing it to 100 ms (or a similarly small value):

    CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=100

    If the issue persists, please share your NCS version and the logging related Kconfigs.

    Best Regards,
    Syed Maysum

  • Hi Syed,

    Adjusted the kconfigs as you suggested, and have now gotten to the point where the board is boot-looping due to a crash in strlen() originating in the logging thread. I can get the full backtrace from the debugger if that would be helpful.

    Logging related kconfigs:

```
CONFIG_LOG=y
CONFIG_LOG_PROCESS_THREAD=y
CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=6144
CONFIG_LOG_BUFFER_SIZE=4096
CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=100
CONFIG_LOG_PROCESS_TRIGGER_THRESHOLD=5
CONFIG_LOG_DEFAULT_LEVEL=3
CONFIG_NRF_CLOUD_CLIENT_ID_SRC_IMEI=y
CONFIG_NRF_CLOUD_REST=y
CONFIG_NRF_CLOUD_REST_AUTOGEN_JWT=y
CONFIG_NRF_CLOUD_LOG_DIRECT=y
CONFIG_NRF_CLOUD_LOG_BACKEND=y
CONFIG_LOG_BACKEND_NRF_CLOUD_OUTPUT_TEXT=y
CONFIG_NRF_CLOUD_LOG_OUTPUT_LEVEL=3
CONFIG_NRF_CLOUD_LOG_RING_BUF_SIZE=4096
CONFIG_LOG_MODE_DEFERRED=y
```


    Initialization code (this happens immediately after the device registers on a cellular network):

```
struct nrf_cloud_rest_context ctx = {
    .connect_socket = -1,
    .keep_alive = true,
    .timeout_ms = 10000,
    .auth = NULL,
    .rx_buf = rx_buf,
    .rx_buf_len = 2048,
};

char imei[128];
char device_id[128];
modem_info_string_get(MODEM_INFO_IMEI, imei, sizeof(imei));
LOG_INF("Got imei %s", imei);
snprintf(device_id, sizeof(device_id), "nrf-%s", imei);
nrf_cloud_log_rest_context_set(&ctx, device_id);

nrf_cloud_log_init();
nrf_cloud_log_enable(true);
nrf_cloud_log_level_set(3);
```

    The kernel panic message originating in the logging thread:

```
[00:00:11.516,662] <err> nrf_cloud_log_backend: Error -12 ret 0 processing ring buffer
[00:00:11.516,662] <inf> nrf_cloud_log_backend: Sent lines:4, bytes:490
[00:00:11.582,672] <err> os: ***** SECURE FAULT *****
[00:00:11.582,702] <err> os:   Address: 0x1
[00:00:11.582,702] <err> os:   Attribution unit violation
[00:00:11.582,733] <err> os: r0/a1:  0x00000001  r1/a2:  0x20018e14  r2/a3:  0x00000000
[00:00:11.582,763] <err> os: r3/a4:  0x00000001 r12/ip:  0x0000024c r14/lr:  0x0002e0f5
[00:00:11.582,794] <err> os:  xpsr:  0x81000000
[00:00:11.582,824] <err> os: s[ 0]:  0x40015000  s[ 1]:  0x00000000  s[ 2]:  0x0005c220  s[ 3]:  0x00000000
[00:00:11.582,824] <err> os: s[ 4]:  0x00000000  s[ 5]:  0x40015000  s[ 6]:  0x00000000  s[ 7]:  0x00026f69
[00:00:11.582,855] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x00026ead  s[10]:  0x000434bc  s[11]:  0x00000020
[00:00:11.582,885] <err> os: s[12]:  0x000370a1  s[13]:  0x20013554  s[14]:  0x00000000  s[15]:  0x00000000
[00:00:11.582,885] <err> os: fpscr:  0x00000008
[00:00:11.582,916] <err> os: Faulting instruction address (r15/pc): 0x00018d42
[00:00:11.582,946] <err> os: >>> ZEPHYR FATAL ERROR 41: Unknown error on CPU 0
[00:00:11.582,977] <err> os: Current thread: 0x2000d6d8 (unknown)
[00:00:11.709,259] <err> fatal_error: Resetting system
```

  • Hi,

    Thanks for your logs. The secure fault in strlen() from the logging thread suggests that an invalid string pointer is being accessed. In the snippet you shared, both device_id and the REST context ctx are defined as local (stack) variables. Since the cloud log backend runs asynchronously in its own thread, it may continue to use these pointers after the initialization code has completed, at which point the memory is no longer valid. 

    Could you please try making the REST context and the device_id buffer persistent (for example, static or global) and then retest:

```
static uint8_t rx_buf[2048];
static struct nrf_cloud_rest_context ctx = {
    .connect_socket = -1,
    .keep_alive = true,
    .timeout_ms = 10000,
    .auth = NULL,
    .rx_buf = rx_buf,
    .rx_buf_len = sizeof(rx_buf),
};

static char device_id[128];

void cloud_log_setup(void)
{
    char imei[32] = {0};

    modem_info_string_get(MODEM_INFO_IMEI, imei, sizeof(imei));
    snprintk(device_id, sizeof(device_id), "nrf-%s", imei);

    nrf_cloud_log_rest_context_set(&ctx, device_id);
    nrf_cloud_log_init();
    nrf_cloud_log_enable(true);
    nrf_cloud_log_level_set(3);
}
```

    If the issue persists after this change, please let us know the NCS version you’re using and share the backtrace so we can investigate further.

    Best regards,
    Syed Maysum

  • Thank you, I think this was the final piece of the puzzle; the system seems to be stable and working now. I'll report back if I run into any more issues.
