nRF9160 & FreeRTOS: nrf_connect() always times out

Hello,
I am working on a project which requires using FreeRTOS with the nRF9160.

Setup:
- Boards tested on: Actinius Icarus, Thingy:91
- VS Code with Ceedling as build tool.
- J-Link base
- bsdlib version 0.8.1
- Swisscom SIM

I ported the Zephyr example https_example to both a bare-metal and a FreeRTOS version.

My issue arises with the nrf_connect() API call, which always returns -1, with NRF_ETIMEDOUT (60).

I am doing my development on two versions:
- A bare-metal version where everything works.
- A FreeRTOS version which always times out on nrf_connect(fd, res->ai_addr, sizeof(struct nrf_sockaddr_in));

Attempting to fix this issue, I used the example https_example from the nRF SDK.
Both versions are identical to the Zephyr version, except that they call the nRF9160 bsdlib API directly.
The bare-metal version works a treat; the FreeRTOS version does not.


Some things I tried:
- I used the bare-metal bsd_os.c in my FreeRTOS version, to ensure the issue does not come from there. The error is still there.
- I used a blocking-only version (from the SDK porting guide template) of bsd_os.c. The error is still there.
- I used the same security tag and certificates for the bare-metal and FreeRTOS versions. I can assure you the certificate is correct in both applications.
- I know the network is connected, because the app is able to resolve google.com and other websites (to ensure they are not in a cache on the device) before opening the TLS socket.

Could FreeRTOS alter configurations which are done by bsdlib, or alter timings in a significant way?
Does anyone have experience with FreeRTOS on the nRF9160?
Do you have any hints?

If I get this to work, I'll happily share the freeRTOS implementation.

Sincerely,
Manu

  • Hi Manuel,

    For now, we are focusing our nRF91 solutions on Zephyr RTOS, so I haven't spent any time looking into nRF91 + FreeRTOS.

    I am doing my development on two versions:
    - A bare-metal version where everything works.
    - A FreeRTOS version which always times out on nrf_connect(fd, res->ai_addr, sizeof(struct nrf_sockaddr_in));

    It looks like one of the tasks in your design is deadlocked. This is still a guess, but a timeout in one context usually means that the system is still running while a few contexts did not run as they should. It is hard to say that with confidence, as we have not tried it ourselves.

    Could I ask why you need to use FreeRTOS rather than Zephyr for your project? Your input will help us understand specific needs that we might have overlooked.

  • I might look for deadlocks again, but when I ran only 2 threads with my "bare-metal" bsd-lib.c version in blocking mode, there shouldn't have been any deadlocks.

    As for Zephyr, we preferred FreeRTOS because we are using AWS services and we expected those to be better integrated into FreeRTOS than into Zephyr. We also have more experience with workflows based on FreeRTOS, and most of our libs/utils are based on it.

  • - I used the bare-metal bsd_os.c in my FreeRTOS version, to ensure the issue does not come from there. The error is still there.
    - I used a blocking-only version (from the SDK porting guide template) of bsd_os.c. The error is still there.
    - I used the same security tag and certificates for the bare-metal and FreeRTOS versions. I can assure you the certificate is correct in both applications.
    - I know the network is connected, because the app is able to resolve google.com and other websites (to ensure they are not in a cache on the device) before opening the TLS socket.

    Where did you get the bare-metal version of bsd_os.c? Could you please upload it here so that I can take a quick look at it?

    In which context do you call nrf_connect in the bare-metal application, and in which context do you call it in FreeRTOS? I would debug and step into nrf_connect in the disassembly to see whether the code paths taken into the library (for this call only) are similar in bare-metal and FreeRTOS.

    Could FreeRTOS alter configurations which are done by bsdlib, or alter timings in a significant way?

    Yes, there is always a chance that the FreeRTOS scheduler behaves differently from the Zephyr RTOS scheduler and alters the timings in a significant way.

    It would help if you could figure out:

    1. The contexts of the call to nrf_connect in the bare-metal and FreeRTOS apps. That would give us some differences that might help narrow down the problem.
    2. bsd_os.c: if you are 100% confident in this bare-metal variant of the file, then it is fine; if not, I would spend some time validating its functionality. I have not seen or tested this before.
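
    One possible way to compare the calling contexts from point 1 is to log the processor mode right before nrf_connect() in both builds. The sketch below is only an illustration and is not taken from either application: `is_handler_mode` and `context_name` are hypothetical helper names, and it assumes a Cortex-M core, where bits [8:0] of the IPSR register hold the active exception number (0 in Thread mode). On target, the argument would come from the CMSIS intrinsic `__get_IPSR()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Cortex-M, where IPSR bits [8:0] hold the active exception
 * number and the value 0 means the core is in Thread mode. */
#define IPSR_EXCEPTION_MASK 0x1FFu

/* Hypothetical helper: true when the given IPSR value belongs to an
 * exception or interrupt handler, false for normal thread/task code. */
static bool is_handler_mode(uint32_t ipsr)
{
    return (ipsr & IPSR_EXCEPTION_MASK) != 0u;
}

/* Hypothetical helper: short label suitable for a log line. */
static const char *context_name(uint32_t ipsr)
{
    return is_handler_mode(ipsr) ? "handler" : "thread";
}

/* On target, one would log context_name(__get_IPSR()) immediately
 * before the nrf_connect() call in both applications and compare. */
```

    If both builds report thread mode, the difference probably lies elsewhere, for example in scheduling or in the interrupt priorities used by the port.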

  • Sorry for my late answer; we put this project on hold until February/March.
    I have attached the FreeRTOS bsd_lib.c and the bare-metal version which worked (bsd_os.minimal.c).
    As for the bare-metal version, I only noticed that it worked, but I am not confident that it works in every situation.

    After all, it has no mechanism to unblock itself before the timeout expires. I only noticed that, strangely, it does work in this situation.

    I have not been able to work on this project.
    I will try to get back to this by March.
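
    To make the timeout handling in the two attached bsd_os_timedwait() variants easier to compare, the remaining-time bookkeeping can be isolated in a small pure function and checked on a host machine. This is a minimal sketch under stated assumptions, not the bsdlib implementation: `elapsed_ms`, `update_remaining` and `SKETCH_ETIMEDOUT` are hypothetical names, the value 60 is the NRF_ETIMEDOUT value reported above, and the semantics (negative timeout means wait forever, otherwise write back the time left) mirror what the FreeRTOS file below does.

```c
#include <stdint.h>

#define SKETCH_ETIMEDOUT 60 /* same value as the NRF_ETIMEDOUT reported in this thread */

/* Milliseconds elapsed between two readings of a free-running 32-bit
 * millisecond counter. Unsigned subtraction stays correct across a
 * single wrap-around of the counter. */
static uint32_t elapsed_ms(uint32_t start_ms, uint32_t now_ms)
{
    return now_ms - start_ms;
}

/* Write the time left into *timeout_ms.
 * Returns 0 while time remains (or when the timeout is negative, which
 * is treated as "wait forever" and left untouched), and
 * SKETCH_ETIMEDOUT once the timeout has been used up. */
static int32_t update_remaining(int32_t *timeout_ms, uint32_t start_ms, uint32_t now_ms)
{
    if (*timeout_ms < 0) {
        return 0;
    }

    uint32_t elapsed = elapsed_ms(start_ms, now_ms);
    if (elapsed >= (uint32_t)*timeout_ms) {
        *timeout_ms = 0;
        return SKETCH_ETIMEDOUT;
    }

    /* elapsed is smaller than a non-negative int32_t here, so the cast is safe. */
    *timeout_ms -= (int32_t)elapsed;
    return 0;
}
```

    Keeping this part free of RTOS calls means the same function could be shared by a busy-wait port and a task-notification port, so that only the waiting mechanism differs between them.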


    // This is an independent project of an individual developer. Dear PVS-Studio, please check it.
    // PVS-Studio Static Code Analyzer for C, C++, C#, and Java: http://www.viva64.com
    #include <string.h>
    #include <errno.h>
    #include "nrf.h"
    
    #include "def.h"
    #include "FreeRTOS.h"
    #include "task.h"
    #include "queue.h"
    #include "semphr.h"
    
    #include "nrf_errno.h"
    #include "bsd_os.h"
    #include "bsd_platform.h"
    #include "bsd_limits.h"
    
    FILENUM(94441);
    
    #define BSD_OS_TRACE_IRQ          EGU2_IRQn
    #define BSD_OS_TRACE_IRQ_PRIORITY 6
    
    #define BSD_APPLICATION_IRQ       EGU1_IRQn
    
    static volatile TaskHandle_t sWaitingTaskHandleTbl[BSD_MAX_SOCKET_COUNT];
    static volatile UInt32 sWaitingTaskCount = 0;
    
    
    int32_t bsd_os_timedwait(uint32_t context, int32_t *timeout)
    {
        // See https://github.com/nrfconnect/sdk-nrf/blob/master/lib/bsdlib/bsd_os.c
        UNUSED(context);
    
        TickType_t start = pdTICKS_TO_MS(xTaskGetTickCount());
        TickType_t sleepTime;
    
        // A timeout of zero means "do not wait": report the timeout immediately.
        if(*timeout == 0)
        {
            return(NRF_ETIMEDOUT);
        }
    
        if((*timeout) < 0)
        {
            // Wait forever, or at least the maximum possible.
            sleepTime = portMAX_DELAY;
        }
        else
        {
            sleepTime = pdMS_TO_TICKS(*timeout);
        }
    
    
        /*
         * Add this tag to the list of sleeping tasks.
         */
        taskENTER_CRITICAL();
        size_t index = 0;
        // Loop through the table until the first empty (NULL) slot is found.
        for(index = 0; (index < ARRAY_SIZE(sWaitingTaskHandleTbl)) && (NULL != sWaitingTaskHandleTbl[index]);
            index++)
        {
            continue;
        }
        if(index == ARRAY_SIZE(sWaitingTaskHandleTbl))
        {
            // No space, this should never happen.
            assert(false);
            NVIC_SystemReset();
        }
        sWaitingTaskHandleTbl[index] = xTaskGetCurrentTaskHandle();
        sWaitingTaskCount += 1;
        taskEXIT_CRITICAL();
    
        // Wait for EGU1_IRQHandler() to notify this task, or for the timeout to expire.
        // The return value is computed below: NRF_ETIMEDOUT, or 0 with the remaining time.
        ulTaskNotifyTake(pdFALSE, sleepTime);
    
        taskENTER_CRITICAL();
        sWaitingTaskHandleTbl[index] = NULL;
        sWaitingTaskCount -= 1;
        taskEXIT_CRITICAL();
    
        if(sleepTime == portMAX_DELAY)
        {
            return(0);
        }
    
        /* Calculate how much time is left until timeout. */
        UInt32 now = pdTICKS_TO_MS(xTaskGetTickCount());
        Int64 remaining = (Int64)(*timeout) - (now - start);
    
        *timeout = (remaining > 0) ? remaining : 0;
        if(*timeout == 0)
        {
            return(NRF_ETIMEDOUT);
        }
    
        return(0);
    }
    
    
    
    /// Function required by BSD library. We need to set the EGU1 interrupt.
    void bsd_os_application_irq_set(void)
    {
        NVIC_SetPendingIRQ(BSD_APPLICATION_IRQ);
    }
    
    /// Function required by BSD library. We need to clear the EGU1 interrupt.
    void bsd_os_application_irq_clear(void)
    {
        NVIC_ClearPendingIRQ(BSD_APPLICATION_IRQ);
    }
    
    /// Function required by BSD library. We need to set the EGU2 interrupt.
    void bsd_os_trace_irq_set(void)
    {
        NVIC_SetPendingIRQ(BSD_OS_TRACE_IRQ);
    }
    
    /// Function required by BSD library. We need to clear the EGU2 interrupt.
    void bsd_os_trace_irq_clear(void)
    {
        NVIC_ClearPendingIRQ(BSD_OS_TRACE_IRQ);
    }
    
    void read_task_create(void)
    {
        // The read task is achieved using SW interrupt.
        NVIC_SetPriority(BSD_APPLICATION_IRQ, BSD_APPLICATION_IRQ_PRIORITY);
        NVIC_ClearPendingIRQ(BSD_APPLICATION_IRQ);
        NVIC_EnableIRQ(BSD_APPLICATION_IRQ);
    }
    
    void trace_task_create(void)
    {
        NVIC_SetPriority(BSD_OS_TRACE_IRQ, BSD_OS_TRACE_IRQ_PRIORITY);
        NVIC_ClearPendingIRQ(BSD_OS_TRACE_IRQ);
        NVIC_EnableIRQ(BSD_OS_TRACE_IRQ);
    }
    
    /// Function required by BSD library.
    void bsd_os_init(void)
    {
        taskENTER_CRITICAL();
        memset((UInt8 *)sWaitingTaskHandleTbl, 0, sizeof(sWaitingTaskHandleTbl));
        sWaitingTaskCount = 0;
        taskEXIT_CRITICAL();
    
        read_task_create();
        trace_task_create();
    }
    
    
    /// Function required by BSD library
    int32_t bsd_os_trace_put(const uint8_t *const data, uint32_t len)
    {
        UNUSED(data);
        UNUSED(len);
        return(0);
    }
    
    
    /// Function required by BSD library. Stores an error code we can read later.
    void bsd_os_errno_set(int err_code)
    {
        // Translate nrf_errno.h errno to the OS specific value.
        errno = err_code;
    }
    
    
    void EGU1_IRQHandler(void)
    {
        bsd_os_application_irq_handler();
    
        // Wake up sleeping/timed-out threads.
        // Initialised once: the yield below must never read an indeterminate
        // value, and a request set by an earlier notification must not be lost.
        BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    
        UInt32 sleeperNotified = 0;
        size_t i = 0;
        while(sleeperNotified < sWaitingTaskCount)
        {
            assert(i < ARRAY_SIZE(sWaitingTaskHandleTbl));
            TaskHandle_t hdl = sWaitingTaskHandleTbl[i];
            if(hdl != NULL)
            {
                vTaskNotifyGiveFromISR(hdl, &xHigherPriorityTaskWoken);
                sleeperNotified += 1;
            }
            i += 1;
        }
    
        portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
    }
    
    void EGU2_IRQHandler(void)
    {
        bsd_os_trace_irq_handler();
    }
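
    The waiting-task table used by bsd_os_timedwait() and EGU1_IRQHandler() above can also be exercised on a host machine without FreeRTOS. The following is a minimal sketch under stated assumptions, not a drop-in replacement: `waiter_t` stands in for TaskHandle_t, `SKETCH_MAX_WAITERS` for BSD_MAX_SOCKET_COUNT, and all function names are hypothetical. It walks the whole table instead of relying on a separate counter, which is one possible way to keep the notification loop simple.

```c
#include <stddef.h>

#define SKETCH_MAX_WAITERS 8 /* stand-in for BSD_MAX_SOCKET_COUNT */

typedef void *waiter_t;      /* stand-in for TaskHandle_t */

static waiter_t waiters[SKETCH_MAX_WAITERS];

/* Claim the first free slot. Returns the slot index, or -1 if the table
 * is full. On target this would run inside a critical section, as the
 * file above does. */
static int waiter_add(waiter_t w)
{
    for (size_t i = 0; i < SKETCH_MAX_WAITERS; i++) {
        if (waiters[i] == NULL) {
            waiters[i] = w;
            return (int)i;
        }
    }
    return -1;
}

/* Release a slot previously returned by waiter_add(). */
static void waiter_remove(int slot)
{
    if (slot >= 0 && slot < SKETCH_MAX_WAITERS) {
        waiters[slot] = NULL;
    }
}

/* Copy every registered waiter into out[] (up to cap entries) and return
 * how many were copied. An interrupt handler could then notify each
 * entry, for example with vTaskNotifyGiveFromISR(). */
static size_t waiter_collect(waiter_t out[], size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < SKETCH_MAX_WAITERS && n < cap; i++) {
        if (waiters[i] != NULL) {
            out[n++] = waiters[i];
        }
    }
    return n;
}
```

    Because the loop bound is the table size, it cannot run past the end of the array even if a counter and the table contents were to disagree.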
    

    // This is an independent project of an individual developer. Dear PVS-Studio, please check it.
    // PVS-Studio Static Code Analyzer for C, C++, C#, and Java: http://www.viva64.com
    #include <string.h>
    #include <errno.h>
    #include "nrf.h"
    
    #include "def.h"
    #include "xfTimeout.h"
    #include "nrf_errno.h"
    #include "bsd_os.h"
    #include "bsd_platform.h"
    
    FILENUM(94441);
    
    #define BSD_OS_TRACE_IRQ          EGU2_IRQn
    #define BSD_OS_TRACE_IRQ_PRIORITY 6
    
    #define BSD_APPLICATION_IRQ       EGU1_IRQn
    
    int32_t bsd_os_timedwait(uint32_t context, int32_t *timeout)
    {
        UNUSED(context);
        // We can't do an infinite wait
        if(*timeout < 0)
        {
            return(0);
        }
    
        Int16 timeoutHandle = xfTimeout_CreateTimeout(*timeout);
        assert(timeoutHandle >= 0);
    
        while(false == xfTimeout_HasTimedOut(timeoutHandle))
        {
            __asm("nop");
        }
        xfTimeout_ReleaseTimeout(timeoutHandle);
    
        return(NRF_ETIMEDOUT);
    
        //return(0);
    }
    
    
    
    /// Function required by BSD library. We need to set the EGU1 interrupt.
    void bsd_os_application_irq_set(void)
    {
        NVIC_SetPendingIRQ(BSD_APPLICATION_IRQ);
    }
    
    /// Function required by BSD library. We need to clear the EGU1 interrupt.
    void bsd_os_application_irq_clear(void)
    {
        NVIC_ClearPendingIRQ(BSD_APPLICATION_IRQ);
    }
    
    /// Function required by BSD library. We need to set the EGU2 interrupt.
    void bsd_os_trace_irq_set(void)
    {
        NVIC_SetPendingIRQ(BSD_OS_TRACE_IRQ);
    }
    
    /// Function required by BSD library. We need to clear the EGU2 interrupt.
    void bsd_os_trace_irq_clear(void)
    {
        NVIC_ClearPendingIRQ(BSD_OS_TRACE_IRQ);
    }
    
    void read_task_create(void)
    {
        // The read task is achieved using SW interrupt.
        NVIC_SetPriority(BSD_APPLICATION_IRQ, BSD_APPLICATION_IRQ_PRIORITY);
        NVIC_ClearPendingIRQ(BSD_APPLICATION_IRQ);
        NVIC_EnableIRQ(BSD_APPLICATION_IRQ);
    }
    
    void trace_task_create(void)
    {
        NVIC_SetPriority(BSD_OS_TRACE_IRQ, BSD_OS_TRACE_IRQ_PRIORITY);
        NVIC_ClearPendingIRQ(BSD_OS_TRACE_IRQ);
        NVIC_EnableIRQ(BSD_OS_TRACE_IRQ);
    }
    
    /// Function required by BSD library.
    void bsd_os_init(void)
    {
        read_task_create();
        trace_task_create();
    }
    
    
    
    /// Function required by BSD library
    int32_t bsd_os_trace_put(const uint8_t *const data, uint32_t len)
    {
        UNUSED(data);
        UNUSED(len);
        return(0);
    }
    
    
    /// Function required by BSD library. Stores an error code we can read later.
    void bsd_os_errno_set(int err_code)
    {
        // Translate nrf_errno.h errno to the OS specific value.
        errno = err_code;
    }
    
    
    // void IPC_IRQHandler(void)
    // {
    // }
    
    void EGU1_IRQHandler(void)
    {
        bsd_os_application_irq_handler();
    }
    
    void EGU2_IRQHandler(void)
    {
        bsd_os_trace_irq_handler();
    }
    

    As for the nrf_connect context, it is called from this code:

    static TransportStatus_t prv_OpenTLSSocket(NetworkContext_t *pNetworkContext)
    {
        int err;
        /* Security tag that we have provisioned the certificate with */
        const nrf_sec_tag_t tls_sec_tag[] = {
            pNetworkContext->TLSSecureTag
        };
    
        /* Set up TLS peer verification */
        enum {
            NONE = 0,
            OPTIONAL = 1,
            REQUIRED = 2,
        };
        nrf_sec_peer_verify_t verify = (nrf_sec_peer_verify_t)REQUIRED;
    
    
        int sock = nrf_socket(NRF_AF_INET, NRF_SOCK_STREAM, NRF_SPROTO_TLS1v2);
        if(sock == -1)
        {
            logWarn("Failed to create socket, err %d\n", errno);
            return(NRF_TRANSPORTSTATUS_CONNECT_FAILURE);
        }
    
        err = nrf_setsockopt(sock, NRF_SOL_SECURE, NRF_SO_SEC_PEER_VERIFY, &verify, sizeof(verify));
        if(err)
        {
            logError("Failed to setup peer verification, err %d\n", errno);
            nrf_close(sock);
            return(NRF_TRANSPORTSTATUS_INVALID_PARAMETER);
        }
    
        // err = nrf_setsockopt(sock, NRF_SOL_SECURE, NRF_SO_HOSTNAME, pNetworkContext->Url, strlen(pNetworkContext->Url));
        // if(err)
        // {
        //     logError("Failed to setup TLS hostname, err %d\n", errno);
        //     nrf_close(sock);
        //     return(NRF_TRANSPORTSTATUS_INVALID_PARAMETER);
        // }
        struct nrf_timeval nrf_timeo;
        nrf_timeo.tv_sec = 10;
        nrf_timeo.tv_usec = 0;
        err = nrf_setsockopt(sock, NRF_SOL_SECURE, NRF_SO_RCVTIMEO, &nrf_timeo, sizeof(struct nrf_timeval));
        if(err)
        {
            logError("Failed to set up RCVTIMEO, err %d\n", errno);
            nrf_close(sock);
            return(NRF_TRANSPORTSTATUS_INVALID_PARAMETER);
        }
    
        /*
         * Associate the socket with the security tag
         * we have provisioned the certificate with.
         */
        err = nrf_setsockopt(sock, NRF_SOL_SECURE, NRF_SO_SEC_TAG_LIST, tls_sec_tag, sizeof(tls_sec_tag));
        if(err)
        {
            logWarn("Failed to setup TLS sec tag, err %d\n", errno);
            nrf_close(sock);
            return(NRF_TRANSPORTSTATUS_INVALID_PARAMETER);
        }
    
        char dest[64];
        nrf_inet_ntop(NRF_AF_INET, (struct nrf_sockaddr *)&(pNetworkContext->ServerAddr.sin_addr), dest, 64);
    
        /*
         * Connect
         */
        logInfo("Opening TLS socket to cloud on ip: %s", dest);
        err = nrf_connect(sock, &(pNetworkContext->ServerAddr), sizeof(struct nrf_sockaddr_in));
        if(err)
        {
            // A failed connect is reported as a connection failure, not as a bad parameter.
            logError("nrf_connect() over TLS failed, err: %d", errno);
            nrf_close(sock);
            return(NRF_TRANSPORTSTATUS_CONNECT_FAILURE);
        }
    
        pNetworkContext->Socket = sock;
    
        return(NRF_TRANSPORTSTATUS_SUCCESS);
    }
    

    I did not try the newer versions of the modem firmware, which seem to fix some things related to TCP.


    Until then, thank you for your time.
    I hope to find some time before then to narrow down the problem.
