
Client without Notifications bricking device

Hey everyone, I have a device that's in production and everything is working great. However, we have a customer that is creating their own software to interface with our sensor, and they bricked a couple of units because they did not turn notifications on. This causes the unit to seemingly wait indefinitely for the thumbs up from the client that it received the packet, which never comes. We don't expect the unit to work without notifications on, but I don't want it to lock up like this... we have no hardware reset capability, so the units are junk now.

I figure there has to be an event for a timeout or the like that I could use to force a disconnect and start advertising again, I just can't seem to find it. I've tried BLE_GATTC_EVT_TIMEOUT, BLE_GATTS_EVT_TIMEOUT, and BLE_GAP_EVT_TIMEOUT. None of these seem to be triggered by the sensor not being able to send a packet.

Any thoughts?

Thanks,

Adam

  • Hello,

    What do you mean by "bricking device"? 

     

    "This causes the unit to seemingly wait indefinitely for the thumbs up from the client that it received the packet, which never comes."

    Which device is waiting? Central or peripheral? And is this "thumbs up" supposed to be delivered via a notification, perhaps?

    Have you/they tried to debug the "bricked device" while this is happening? Does the log say anything?

    My suspicion is that the device is trying to send a notification, but this isn't possible since it isn't enabled by the central(client). But you/they would need to debug to see why the thumbs up is not working. Remember that we don't know anything about the logic in your application.

    Best regards,

    Edvin

  • This is a production device that a customer is having this issue with when trying to integrate into their software ecosystem, no debug available. Like I said, there's no hardware reset, and the device disappears.

    "My suspicion is that the device is trying to send a notification, but this isn't possible since it isn't enabled by the central(client)."

    Yes, exactly what's happening. I need to detect when this happens, and instead of sitting there waiting forever, I need to disconnect and advertise. I just can't find the notification that says the client hasn't responded or whatever to trigger the disconnect and advertise.

  • Alright, I'm totally lost... this SDK is so hard to follow for a hardware guy like me.

    Right now my "on_write" function is:

    static void on_write(ble_ctcws_t * p_ctcws, ble_evt_t const * p_ble_evt)
    {
        ble_gatts_evt_write_t const * p_evt_write = &p_ble_evt->evt.gatts_evt.params.write;
    
        //take care of a write to config
        if (p_evt_write->handle == p_ctcws->config_char_handles.value_handle)
        {
            p_ctcws->config_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
        }
    
        //take care of a write to samples
        else if (p_evt_write->handle == p_ctcws->samples_char_handles.value_handle)
        {
            p_ctcws->samples_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
        }
    
        //take care of a write to status
        else if (p_evt_write->handle == p_ctcws->status_char_handles.value_handle)
        {
            p_ctcws->status_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
        }
    }

    One of those (status) is one that I need to check notifications on. Would I just add the CCCD stuff like so (line 22 onward)?

    static void on_write(ble_ctcws_t * p_ctcws, ble_evt_t const * p_ble_evt)
    {
        ble_gatts_evt_write_t const * p_evt_write = &p_ble_evt->evt.gatts_evt.params.write;
    
        //take care of a write to config
        if (p_evt_write->handle == p_ctcws->config_char_handles.value_handle)
        {
            p_ctcws->config_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
        }
    
        //take care of a write to samples
        else if (p_evt_write->handle == p_ctcws->samples_char_handles.value_handle)
        {
            p_ctcws->samples_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
        }
    
        //take care of a write to status
        else if (p_evt_write->handle == p_ctcws->status_char_handles.value_handle)
        {
            p_ctcws->status_write_handler(p_ctcws, p_evt_write->data, p_evt_write->len);
    
            if ((p_evt_write->handle == p_ctcws->status_char_handles.cccd_handle) && (p_evt_write->len == 2)){
                if (ble_srv_is_notification_enabled(p_evt_write->data)){
                    //notifications are on
                }
                else{
                    //notifications are off
                }
            }
        }
    }

    The other characteristic doesn't have a write event because its data is transmit-only. Would I just create a new write handle and all that for that one, then implement it like the status one above?

    Also, how does this status get back to the main program? This service thing is the one part of the code I never really got my head around... it all just looks like an enormous web of data structures and things I can't follow.

    Also, just thinking out loud, what happens if the notifications are on by default, will this ever get called? Is there a way I can just check notification status in my main.c? That would be way easier, if I could just check before sending a packet.
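
On the question of checking notification status from main.c: one possible approach, sketched below, is to let the CCCD write update a flag that the application reads before it sends. This is a host-compilable illustration, not SDK code. `char_handles_t` and `write_evt_t` are simplified stand-ins for `ble_gatts_char_handles_t` and `ble_gatts_evt_write_t`, and the function and flag names are hypothetical. Note that the CCCD check sits in its own branch: a write event carries a single handle, so a CCCD comparison nested inside the value-handle branch can never match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the SDK's ble_gatts_char_handles_t (only the fields used here). */
typedef struct {
    uint16_t value_handle;
    uint16_t cccd_handle;
} char_handles_t;

/* Stand-in for the relevant fields of ble_gatts_evt_write_t. */
typedef struct {
    uint16_t       handle;
    uint16_t       len;
    const uint8_t *data;
} write_evt_t;

/* Hypothetical flag that main.c can query before sending. */
static volatile bool m_status_notify_enabled = false;

/* The CCCD value is a 2-byte little-endian bit field; bit 0 = notifications. */
static bool cccd_notifications_enabled(const uint8_t *data, uint16_t len)
{
    if (data == NULL || len != 2) {
        return false;
    }
    uint16_t cccd = (uint16_t)(data[0] | ((uint16_t)data[1] << 8));
    return (cccd & 0x0001u) != 0;
}

/* Dispatch on the written handle: value and CCCD are separate branches. */
static void on_write_sketch(const char_handles_t *status, const write_evt_t *evt)
{
    if (evt->handle == status->value_handle) {
        /* The existing status_write_handler call would go here. */
    } else if (evt->handle == status->cccd_handle) {
        m_status_notify_enabled = cccd_notifications_enabled(evt->data, evt->len);
    }
}

/* Accessors for the application side. */
static bool status_notifications_enabled(void)
{
    return m_status_notify_enabled;
}

static void status_notifications_reset(void)
{
    m_status_notify_enabled = false;
}
```

In the real handler, `ble_srv_is_notification_enabled()` performs the same decoding. The flag would also need clearing on disconnect (assuming the devices are not bonded), since the CCCD then starts out disabled on each new connection.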

  • Do you have anyone with more software experience that can have a look?

    If you look at the ble_app_uart example, in the same event handler:

        if ((p_evt_write->handle == p_nus->tx_handles.cccd_handle) &&
            (p_evt_write->len == 2))
        {
            if (p_client != NULL)
            {
                if (ble_srv_is_notification_enabled(p_evt_write->data))
                {
                    p_client->is_notification_enabled = true;
                    evt.type                          = BLE_NUS_EVT_COMM_STARTED;
                }
                else
                {
                    p_client->is_notification_enabled = false;
                    evt.type                          = BLE_NUS_EVT_COMM_STOPPED;
                }
    
                if (p_nus->data_handler != NULL)
                {
                    p_nus->data_handler(&evt);
                }
    
            }
        }

    The lines:

    evt.type                          = BLE_NUS_EVT_COMM_STARTED;

    and

    p_nus->data_handler(&evt);

    set the event type and hand the event to the application. You don't see this event in the event handler in main.c, but that is only because it is not checked.

    If you add this event in the nus_data_handler() event handler in main.c you can see it:

    static void nus_data_handler(ble_nus_evt_t * p_evt)
    {
    
        if (p_evt->type == BLE_NUS_EVT_RX_DATA)
        {
            uint32_t err_code;
    
            NRF_LOG_DEBUG("Received data from BLE NUS. Writing data on UART.");
            NRF_LOG_HEXDUMP_DEBUG(p_evt->params.rx_data.p_data, p_evt->params.rx_data.length);
    
            for (uint32_t i = 0; i < p_evt->params.rx_data.length; i++)
            {
                do
                {
                    err_code = app_uart_put(p_evt->params.rx_data.p_data[i]);
                    if ((err_code != NRF_SUCCESS) && (err_code != NRF_ERROR_BUSY))
                    {
                        NRF_LOG_ERROR("Failed receiving NUS message. Error 0x%x. ", err_code);
                        APP_ERROR_CHECK(err_code);
                    }
                } while (err_code == NRF_ERROR_BUSY);
            }
            if (p_evt->params.rx_data.p_data[p_evt->params.rx_data.length - 1] == '\r')
            {
                while (app_uart_put('\n') == NRF_ERROR_BUSY);
            }
        }
        else if (p_evt->type == BLE_NUS_EVT_COMM_STARTED)
        {
            NRF_LOG_INFO("notifications started");
        }
    
    }

    The p_nus->data_handler(&evt) call forwards the event to the event handler that is passed in to ble_nus_init(). I am not sure whether you have implemented this kind of event handler in your custom project, but it is possible.

    This characteristic (the TX characteristic) is not a write characteristic either, but since it is possible to enable notifications on it, it is possible to write to the CCCD of the characteristic. This will appear as a BLE_GATTS_EVT_WRITE event.

    I don't know anything about how your characteristic is set up. Are notifications supported?

    If you want a getting started guide, you can check out this github guide:

    https://github.com/edvinand/custom_ble_service_example

    However, I believe that you have already made a working application, so this guide may not suit very well. 

    This is at least how you can check whether notifications are enabled. How to forward this to main.c is not Nordic SDK specific, but this is one way to do it in C.
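
The callback pattern described above can be sketched in plain C roughly as follows. All names here (`svc_t`, `svc_init`, `svc_report_notify_state`, `app_svc_evt_handler`) are hypothetical and only mirror the shape of `ble_nus_init()` and `data_handler`. The service stores a function pointer supplied by main.c and calls it when the notification state changes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types reported by the service to the application. */
typedef enum {
    SVC_EVT_NOTIFY_STARTED,
    SVC_EVT_NOTIFY_STOPPED
} svc_evt_type_t;

typedef struct {
    svc_evt_type_t type;
} svc_evt_t;

/* Signature of the handler that main.c provides. */
typedef void (*svc_evt_handler_t)(const svc_evt_t *evt);

/* Hypothetical service instance; only the callback is shown. */
typedef struct {
    svc_evt_handler_t evt_handler;
} svc_t;

/* Called once from main.c, in the same spirit as ble_nus_init(). */
static void svc_init(svc_t *svc, svc_evt_handler_t handler)
{
    svc->evt_handler = handler;
}

/* Called by the service when it sees a CCCD write. */
static void svc_report_notify_state(const svc_t *svc, bool enabled)
{
    svc_evt_t evt;
    evt.type = enabled ? SVC_EVT_NOTIFY_STARTED : SVC_EVT_NOTIFY_STOPPED;

    if (svc->evt_handler != NULL) {
        svc->evt_handler(&evt);
    }
}

/* ---- main.c side ---- */
static bool m_can_send = false;

static void app_svc_evt_handler(const svc_evt_t *evt)
{
    /* Remember the state so the transmit code can check it first. */
    m_can_send = (evt->type == SVC_EVT_NOTIFY_STARTED);
}
```

main.c would call `svc_init(&svc, app_svc_evt_handler)` during startup and test `m_can_send` before attempting to transmit.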

    Best regards,

    Edvin

  • So I can't get that to work, no other coder here, just me. I'm a pretty competent C coder but this SDK is so hard to follow (very common complaint around the internet). I was going to implement a timeout, since that's pretty easy: if the program gets stuck, force a timeout and go back to advertising. However, when I hit this snag it seems to be disabling all other interrupts. My SPIM gets hung up waiting for an event end interrupt, and my RTC interrupt to reset everything doesn't trigger.

    I set my RTC interrupt priority to 0 to try and override whatever the SD is doing. I'm totally lost now... I get that some SD events are priority 0, but I don't know how/why writing to a characteristic that doesn't have notifications set on the other side would trigger a priority 0 interrupt.

    Any thoughts? I'm beating my head against the wall here.

  • Adam Gerken said:
    I set my RTC interrupt priority to 0 to try and override whatever the SD is doing. I'm totally lost now... I get that some SD events are priority 0, but I don't know how/why writing to a characteristic that doesn't have notifications set on the other side would trigger a priority 0 interrupt.

    Are you talking about the interrupt priorities now? If so, don't change them. It shouldn't be needed in this case. Setting the RTC IRQ priority to 0 will break the SoftDevice's functionality.

    Set the RTC priority back to 6.

    Are you able to see from your application when the notification is enabled? If yes, I suggest you look into using the app_timer as your timer to check whether notifications are enabled. Check out the ble_app_hrs example, which uses an app_timer to generate simulated battery measurements.

    Best regards,

    Edvin

  • Yes, I'm talking about interrupts. When my application sends data to a client that doesn't have notifications enabled, the program hangs up and no interrupts work any more. I'm trying to use a timer to reset the connection and go back to advertising when this happens, but it doesn't work because interrupts don't work. No interrupts work when this happens.... My SPI hangs up, Timer hangs up, RTC hangs up, everything.

    No, I can not see in my application when notifications are enabled. That's exactly what I said I can't get to work.

    How does a timer help me check for an event when notifications are enabled? I'm not following you at all here, I think we're talking about very different things. Are you suggesting using a timer to poll the notification status? I asked a while ago if I can just check the status of notifications, because then I can just do that before trying to transmit on that characteristic, and just not transmit if notifications aren't enabled. That would be the ideal situation, but I thought you said you can't query the notification status.

    The basic flow of the program:

    Server = Nordic device, Client = PC

    • Server advertises
    • Client connects
    • Client requests x number of data samples
    • Server collects samples
    • Server transmits samples 

    It's the last step that's getting hung up if the client doesn't have notifications enabled. The Server needs to transmit a large amount of data but just locks up.

  • I think you are rushing ahead:

     

    Adam Gerken said:
    When my application sends data to a client that doesn't have notifications enabled, the program hangs up and no interrupts work any more.

     Perhaps this is because the ble_nus_data_send() returns != 0?

     

    Edvin said:
    Are you able to see from your application when the notification is enabled?

  • "Perhaps this is because the ble_nus_data_send() returns != 0?"

    I'm not using nus, it's a custom service but is probably similar. I'm only processing two error codes right now, NRF_SUCCESS and NRF_ERROR_RESOURCES. If the "send_data" service returns success, I compile another packet and send it to the SD until I get an NRF_ERROR_RESOURCES, then I wait for the BLE_GATTS_EVT_HVN_TX_COMPLETE event, and do the whole process again.

    I just tried looking for other error codes and quitting the transmission if "send_data" returns something other than SUCCESS or ERROR_RESOURCES but it still hangs up before I can even look at what the code is.

    "Are you able to see from your application when the notification is enabled?"

    No. This is the whole point of this whole thread. I don't know how to do this. This is exactly what I want to do.

  • Hello Adam,

    Do you have "DEBUG" in the preprocessor symbols for your project? I suspect that your project is calling sd_ble_gatts_hvx and then asserting when it receives the NRF_ERROR_INVALID_STATE error code. If DEBUG is defined, this can put you into an infinite loop at the bottom of the default app_error_fault_handler, with all of your application's interrupts disabled. If you remove the DEBUG symbol, the assert handler will initiate a system reset instead.

  • I don't believe so. My preprocessor defines are just:

    CONFIG_GPIO_AS_PINRESET
    CONFIG_NFCT_PINS_AS_GPIOS
    FLOAT_ABI_HARD
    INITIALIZE_USER_SECTIONS
    NO_VTOR_CONFIG
    NRF52840_XXAA
    NRF_SD_BLE_API_VERSION=6
    S140
    SOFTDEVICE_PRESENT
    SWI_DISABLE0

    I do believe that I'm getting NRF_ERROR_INVALID_STATE back after I call my send_data, which does call sd_ble_gatts_hvx. The issue I'm running into now, is what do I do then? If I call sd_ble_gap_disconnect I also just get NRF_ERROR_INVALID_STATE back. I think this is where it's all hung up.

    I don't do an error check after calling my send data routine though, so I'm not sure why it would be getting hung up to the point of disabling all the interrupts. I only look for SUCCESS, or ERROR_RESOURCES for program logic stuff.

    Here's the routine that's calling my "send_data"

    void transmit_samples(void){
    
        ret_code_t            err_code;
        uint16_t              bytes_to_tx, packet_bytes_to_tx;
        uint8_t               ble_packet[DEFAULT_MAX_MTU_SIZE];
        uint8_t               checksum;
    
        //send packets until we've filled the buffer, or we're done.
        if (requested_samples > transmitted_samples){
    
          while((err_code != NRF_ERROR_RESOURCES) && (requested_samples > transmitted_samples)){
            
            /*FORM THE PACKET*/
    
            //send the packet!
            err_code = ble_ctcws_send_data(m_conn_handle, &m_ctcws, ble_packet, &packet_bytes_to_tx);
        
            //if we didn't get an error, and we're not done: increment the number of samples and packets
            if (err_code == NRF_SUCCESS){
              transmitted_samples += (bytes_to_tx/2);
              packet_number ++;
            }
          }//while != NRF_ERROR_RESOURCES
    
        }//if (requested_samples > transmitted_samples)
    
        //if we're done, let the device know
        else {
            send_application_status(TRANSFER_DONE);
            #ifdef DEBUG
            printf("Transmission done, sent %d samples in %d packets\r\n",transmitted_samples, packet_number);
            #endif
            rtc_set(DEFAULT_TIMEOUT);
            requested_samples = 0;
        }
    
    }//transmit_samples

    Here's my "send_data":

    uint32_t ble_ctcws_send_status(uint16_t conn_handle, ble_ctcws_t * p_ctcws, uint8_t * p_data, uint16_t * p_length)
    {
        ble_gatts_hvx_params_t params;
    
        memset(&params, 0, sizeof(params));
        params.type   = BLE_GATT_HVX_NOTIFICATION;
        params.handle = p_ctcws->status_char_handles.value_handle;
        params.p_data = p_data;
        params.p_len  = p_length;
    
        return sd_ble_gatts_hvx(conn_handle, &params);
    }
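
One thing that stands out in transmit_samples as posted: the while loop only exits on NRF_ERROR_RESOURCES or when all samples are sent, and err_code is read before it is first assigned. If the send call keeps returning something else, such as NRF_ERROR_INVALID_STATE, nothing in the visible code changes either condition, so the loop would appear to spin forever (the elided packet-forming step might alter this). A hedged sketch of a loop that bails out on unexpected codes is below. The constants are local stand-ins for the nrf_error.h values, and `send_fn_t`, `tx_result_t` and the scripted sender are hypothetical helpers so that the example runs on a PC.

```c
#include <stdint.h>

/* Local stand-ins for the nrf_error.h values. */
#define SKETCH_SUCCESS             0u
#define SKETCH_ERROR_INVALID_STATE 8u
#define SKETCH_ERROR_RESOURCES     19u

/* Hypothetical send hook; in the firmware this would wrap sd_ble_gatts_hvx. */
typedef uint32_t (*send_fn_t)(void *ctx);

typedef enum {
    TX_DONE,                 /* every requested packet was queued          */
    TX_WAIT_FOR_TX_COMPLETE, /* buffers full, resume on HVN_TX_COMPLETE    */
    TX_ABORTED               /* unexpected error, stop instead of retrying */
} tx_result_t;

/* Queue packets until done, blocked, or an unexpected error occurs. */
static tx_result_t transmit_until_blocked(send_fn_t send, void *ctx,
                                          uint32_t *sent, uint32_t requested,
                                          uint32_t *last_err)
{
    while (*sent < requested) {
        uint32_t err = send(ctx);
        *last_err = err;

        if (err == SKETCH_SUCCESS) {
            (*sent)++;
        } else if (err == SKETCH_ERROR_RESOURCES) {
            return TX_WAIT_FOR_TX_COMPLETE;
        } else {
            /* e.g. INVALID_STATE when notifications are not enabled */
            return TX_ABORTED;
        }
    }
    return TX_DONE;
}

/* Test double: replays a scripted list of codes, then repeats the last one. */
typedef struct {
    const uint32_t *codes;
    uint32_t        count;
    uint32_t        next;
} scripted_sender_t;

static uint32_t scripted_send(void *ctx)
{
    scripted_sender_t *s = ctx;
    if (s->count == 0) {
        return SKETCH_SUCCESS;
    }
    if (s->next < s->count) {
        return s->codes[s->next++];
    }
    return s->codes[s->count - 1];
}
```

On TX_ABORTED the application could then stop the transfer and decide whether to disconnect, rather than retrying the same packet.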

  • Is your transmit_samples function called in an interrupt context? If so, no other interrupts can preempt it unless they have a strictly higher priority (lower number).
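
The preemption rule mentioned above can be stated as a one-line predicate. This is a simplified model that assumes plain Cortex-M preemption priorities with no sub-priority bits; `can_preempt` is a hypothetical helper, not an SDK function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Lower number means more urgent. A pending interrupt only preempts the
 * running handler if its priority number is strictly lower; equal priority
 * has to wait until the running handler returns.
 */
static bool can_preempt(uint8_t running_prio, uint8_t pending_prio)
{
    return pending_prio < running_prio;
}
```

So if a handler running at priority 6 never returns, other handlers at priority 6 or 7 stay pending, while a priority-2 interrupt would still run.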

    If sd_ble_gap_disconnect is returning NRF_ERROR_INVALID_STATE, then according to the API documentation: "Disconnection in progress or link has not been established." I'm not sure how you debugged it, but maybe the first call to sd_ble_gap_disconnect succeeded and then the second call reported an error and got your attention?
