nRF52840 random disconnection

Good morning,

We have new equipment based on the nRF52840 and the S140 SoftDevice.

A new problem has just appeared, and it is very strange because it seems to be a ‘random’ error:

We have already assembled dozens of units, and most of them work perfectly with any smartphone (using either your ‘nRF Connect’ application or our own application).

However, a few smartphones show strange behavior: they connect perfectly to some of our nRF52840-based units, but with a few others they connect and are then disconnected after a few seconds. The same units connect to other smartphones without problems (no disconnection).

This is, step by step, what happens in those cases:

  • Equipment (nRF52840) is advertising.
  • Smartphone establishes a connection with the equipment using a PIN.
  • The first time, the connection works perfectly.
  • If the smartphone is disconnected and then connects again to the same equipment (already bonded), the connection is established but the smartphone disconnects after approximately 30 s. The same behavior occurs with both the nRF Connect application and our own application. In the nRF Connect log we see: ‘Connection terminated by peer (status 19)’ or ‘GATT_CONN_TIMEOUT’
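
For reference, the two log messages map onto standard Bluetooth HCI error codes: status 19 is 0x13 (Remote User Terminated Connection), meaning the peer, here the nRF52840, requested the disconnect, while Android reports a supervision timeout (0x08) as GATT_CONN_TIMEOUT. A minimal sketch of decoding such reason codes follows; the enum and function names are hypothetical helpers, not part of any SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* A few Bluetooth HCI error codes that commonly appear as disconnect reasons.
 * The numeric values come from the Bluetooth Core Specification; the
 * identifier names are made up for this sketch. */
enum {
    DEMO_HCI_CONNECTION_TIMEOUT     = 0x08, /* supervision timeout (Android: GATT_CONN_TIMEOUT) */
    DEMO_HCI_REMOTE_USER_TERMINATED = 0x13, /* 19 decimal: the peer asked to disconnect */
    DEMO_HCI_LOCAL_HOST_TERMINATED  = 0x16, /* this side asked to disconnect */
    DEMO_HCI_LL_RESPONSE_TIMEOUT    = 0x22, /* link-layer procedure got no response */
    DEMO_HCI_CONN_FAILED_TO_BE_EST  = 0x3E  /* connection never became established */
};

/* Return a short human-readable name for a disconnect reason. */
const char *demo_hci_reason_to_str(uint8_t reason)
{
    switch (reason) {
    case DEMO_HCI_CONNECTION_TIMEOUT:     return "connection (supervision) timeout";
    case DEMO_HCI_REMOTE_USER_TERMINATED: return "remote user terminated connection";
    case DEMO_HCI_LOCAL_HOST_TERMINATED:  return "local host terminated connection";
    case DEMO_HCI_LL_RESPONSE_TIMEOUT:    return "link-layer response timeout";
    case DEMO_HCI_CONN_FAILED_TO_BE_EST:  return "connection failed to be established";
    default:                              return "other/unknown reason";
    }
}

/* True when one side deliberately requested the disconnect, as opposed to
 * the link being lost or a procedure timing out. */
bool demo_hci_reason_is_deliberate(uint8_t reason)
{
    return reason == DEMO_HCI_REMOTE_USER_TERMINATED ||
           reason == DEMO_HCI_LOCAL_HOST_TERMINATED;
}
```

Read this way, ‘status 19’ suggests looking for the place in the firmware that calls for a disconnect, rather than a radio-range problem.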

 

Why is it happening?

What is the reason?

How can we solve this?

 

 Could you help us? (For your information, we have set MIN_CONN_INTERVAL to 7.5 ms and MAX_CONN_INTERVAL to 100 ms.)
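
As a quick sanity check on those values: BLE connection intervals are expressed in units of 1.25 ms and must lie between 7.5 ms and 4 s. The sketch below converts the two intervals and checks the range; the function names are hypothetical (nRF5 SDK examples typically do the conversion with the `MSEC_TO_UNITS` macro instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_UNIT_1_25_MS_US 1250u /* one connection-interval unit, in microseconds */
#define DEMO_CONN_INT_MIN    6u    /* 7.5 ms, the smallest interval BLE allows */
#define DEMO_CONN_INT_MAX    3200u /* 4 s, the largest interval BLE allows */

/* Convert an interval in microseconds to 1.25 ms units.
 * The input must be an exact multiple of 1.25 ms. */
uint16_t demo_conn_interval_units(uint32_t interval_us)
{
    assert(interval_us % DEMO_UNIT_1_25_MS_US == 0);
    return (uint16_t)(interval_us / DEMO_UNIT_1_25_MS_US);
}

/* Check that a min/max pair is ordered and inside the range BLE allows. */
bool demo_conn_interval_is_valid(uint16_t min_units, uint16_t max_units)
{
    return min_units >= DEMO_CONN_INT_MIN &&
           max_units <= DEMO_CONN_INT_MAX &&
           min_units <= max_units;
}
```

With the values above, 7.5 ms becomes 6 units and 100 ms becomes 80 units, which is a valid pair.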

Thank you

  • Hi Daniel, 

    I got the update from Brian Kim about your email discussion. I agree with Brian that a sniffer trace would play a very important role in solving this issue. I have some comments:

    - Regarding the sniffer trace, I could not see the communication because the link was encrypted. You need to either use Legacy pairing and have the sniffer follow the initial bonding, or use LESC in debug mode.

    - Another option is to turn off the bonding and encryption requirements to see whether the issue is related to bonding. If the issue remains, please capture a sniffer trace.

    - The disconnection happened 26 seconds into the connection, so it may not be related to the 30-second GAP/GATT timeout.

    - Please try a chip erase on the defective board and test again.

    - Please try testing with one of the examples in our SDK, for example ble_app_proximity.

  • Hi Hung,

    I will try to capture the trace you are asking for (even without bonding and encryption).

    Regarding the last two points: note that the same chip works perfectly with other smartphones. And, as Brian has probably told you, regarding the smartphone that shows the problem:

    - The first connection (when entering the PIN code, bonding) never has problems.

    - Subsequent connections (equipment already bonded, so the PIN is not requested) are when the smartphone disconnects after about 26 s. Also note that it sometimes works fine.

    - When other smartphones connect to the same equipment, they always work fine.

    - When the affected smartphone connects to other equipment, it also works fine with most of them --> The problem appears with this smartphone (and other models I sent to Brian) and a few nRF52840-based units (not all of them, although they are all identical).

    Maybe it is a bit difficult to follow all the scenarios detailed above. If you need more details, we can arrange a virtual meeting: in that case, send me a link by mail.

    Best regards,

    Dani.

  • Hi again,

    I have programmed one unit with bonding turned off, and now it is impossible to connect (please see the attached screenshot from nRF Connect).

    To turn it off, I set my variable 'BONDING_WITH_NO_CODE' to 1 and implemented this in the 'peer_manager_init' function:

    static void peer_manager_init(void)
    {
        ble_gap_sec_params_t sec_param;
        ret_code_t           err_code;
    
        err_code = pm_init();
        APP_ERROR_CHECK(err_code);
    
        memset(&sec_param, 0, sizeof(ble_gap_sec_params_t));
    
        // Security parameters to be used for all security procedures.
        if (BONDING_WITH_NO_CODE)
        {
            // No bonding, no MITM protection, no key distribution.
            sec_param.bond           = 0;
            sec_param.oob            = 0;
            sec_param.mitm           = 0;
            sec_param.io_caps        = BLE_GAP_IO_CAPS_NONE;
            sec_param.kdist_own.enc  = 0;
            sec_param.kdist_own.id   = 0;
            sec_param.kdist_peer.enc = 0;
            sec_param.kdist_peer.id  = 0;
        }
        else
        {
            sec_param.bond           = SEC_PARAM_BOND;
            sec_param.oob            = SEC_PARAM_OOB;
            sec_param.mitm           = SEC_PARAM_MITM; // Static key
            sec_param.io_caps        = SEC_PARAM_IO_CAPABILITIES;
            sec_param.kdist_own.enc  = 1;
            sec_param.kdist_own.id   = 1;
            sec_param.kdist_peer.enc = 1;
            sec_param.kdist_peer.id  = 1;
        }

        // Settings common to both configurations.
        sec_param.lesc           = SEC_PARAM_LESC;
        sec_param.keypress       = SEC_PARAM_KEYPRESS;
        sec_param.min_key_size   = SEC_PARAM_MIN_KEY_SIZE;
        sec_param.max_key_size   = SEC_PARAM_MAX_KEY_SIZE;
    
        err_code = pm_sec_params_set(&sec_param);
        APP_ERROR_CHECK(err_code);
    
        err_code = pm_register(pm_evt_handler);
        APP_ERROR_CHECK(err_code);
    }

    Furthermore, in this case, I set the attribute permissions in all services as follows:

    if (BONDING_WITH_NO_CODE)
    {
        BLE_GAP_CONN_SEC_MODE_SET_OPEN(&attr_md.read_perm);
        BLE_GAP_CONN_SEC_MODE_SET_OPEN(&attr_md.write_perm);
    }
    else
    {
        BLE_GAP_CONN_SEC_MODE_SET_ENC_WITH_MITM(&attr_md.read_perm);
        BLE_GAP_CONN_SEC_MODE_SET_ENC_WITH_MITM(&attr_md.write_perm);
    }

    I hope this helps us move toward a solution.

    Best regards,

    Dani.

  • Good afternoon,

    I'm still investigating this issue, as we need to have a solution implemented ASAP.

    In the 'ble_evt_handler' function, I handle the BLE_GATTC_EVT_TIMEOUT and BLE_GATTS_EVT_TIMEOUT events in the same way, as follows:

    case BLE_GATTC_EVT_TIMEOUT:
        // Disconnect on GATT Client timeout event.
        err_code = sd_ble_gap_disconnect(p_ble_evt->evt.gattc_evt.conn_handle,
                                         BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION);
        APP_ERROR_CHECK(err_code);
        break;

    case BLE_GATTS_EVT_TIMEOUT:
        // Disconnect on GATT Server timeout event.
        err_code = sd_ble_gap_disconnect(p_ble_evt->evt.gatts_evt.conn_handle,
                                         BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION);
        APP_ERROR_CHECK(err_code);
        break;

    When the disconnection occurs, the equipment receives the BLE_GATTS_EVT_TIMEOUT event.

    If I comment out the code that handles this event, the disconnection problem disappears.

    Is this a good workaround? What exactly is the difference between these two events? What other problems can arise if I comment out the handling of the BLE_GATTS_EVT_TIMEOUT event?

    Best regards,

    Dani

  • Hi Daniel, 

    I'm sorry for the late reply. It was Norway's national day yesterday.
    The BLE_GATTC_EVT_TIMEOUT and BLE_GATTS_EVT_TIMEOUT events occur when a read/write from the client, or an indication from the server, gets no reply from the peer device. So something must be wrong with the peer device, and the default behavior is to disconnect. Most likely the peer device is unable to handle the command and cannot reply for some reason.

    If you remove the disconnect request as in your code, the connection can remain, but we don't know what is wrong with the peer device, and the peer may behave unexpectedly.
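
To illustrate the server-side case: the ATT protocol allows 30 seconds for a transaction, so an indication that is never confirmed by the peer ends in a timeout. The following is only a conceptual model of that rule, assuming a millisecond tick supplied by the caller; the type and function names are hypothetical, and this is not how the SoftDevice is implemented internally.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ATT_TRANSACTION_TIMEOUT_MS 30000u /* 30 s ATT transaction timeout */

/* Tracks a single outstanding indication (only one may be pending at a time). */
typedef struct {
    bool     pending;    /* true while waiting for the peer's confirmation */
    uint32_t sent_at_ms; /* tick at which the indication was sent */
} demo_indication_t;

/* Call when the server sends an indication. */
void demo_indication_sent(demo_indication_t *ind, uint32_t now_ms)
{
    ind->pending    = true;
    ind->sent_at_ms = now_ms;
}

/* Call when the peer's Handle Value Confirmation arrives. */
void demo_confirmation_received(demo_indication_t *ind)
{
    ind->pending = false;
}

/* True once an indication has been waiting 30 s without confirmation.
 * Unsigned subtraction keeps the comparison correct across tick wrap-around. */
bool demo_indication_timed_out(const demo_indication_t *ind, uint32_t now_ms)
{
    return ind->pending &&
           (uint32_t)(now_ms - ind->sent_at_ms) >= DEMO_ATT_TRANSACTION_TIMEOUT_MS;
}
```

In this model, a peer that silently drops the indication never clears `pending`, which is the situation in which the timeout event would be expected.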

    In your application, do you perform any write requests, read requests or indications?

    My suggestion is to try again with no bond and capture a sniffer trace. We really need to know what happens over the air. Please try to capture the sniffer trace when you cannot connect.

    I suspect that if no bond is involved the issue will not occur. In that case you may need to consider using Legacy bonding instead of LESC; this way the sniffer can follow the Legacy bonding and obtain the bond information to decrypt the connection when the devices reconnect after bonding.

  • Thank you so much.

    I will try to capture that trace with the sniffer.

    Regarding Legacy bonding: which parameter do I have to set instead of LESC? (A few messages above, I described how bonding is set up.)

    Kind regards,

    Dani
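
On the Legacy question: in the security parameters shown earlier in the thread, the `lesc` bit is what selects between the two pairing methods, so Legacy pairing would normally mean building with `SEC_PARAM_LESC` defined as 0. The sketch below shows the idea on a reduced stand-in struct so that it compiles without the SDK; `demo_sec_params_t` and the function name are hypothetical, and only the field names mirror `ble_gap_sec_params_t`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the SoftDevice's ble_gap_sec_params_t, reduced to
 * the flag bits discussed in this thread. Not the real type. */
typedef struct {
    uint8_t bond     : 1; /* 1 = store keys and bond */
    uint8_t mitm     : 1; /* 1 = require MITM protection (PIN/passkey) */
    uint8_t lesc     : 1; /* 1 = LE Secure Connections, 0 = Legacy pairing */
    uint8_t keypress : 1; /* 1 = keypress notifications */
} demo_sec_params_t;

/* Switch a configuration to Legacy pairing without touching the bonding
 * and MITM settings. */
void demo_select_legacy_pairing(demo_sec_params_t *params)
{
    params->lesc = 0;
}
```

Bonds created under the previous setting would presumably need to be deleted on both the phone and the device before retesting, so that pairing runs again with the new method.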
