
Best approach for collecting data from hundreds of sensors

Hello, 

I'm looking for some general guidance on the following application:

  • 1 nRF52840 based gateway to cloud, on power supply
  • a few hundred nRF52832 sensor nodes, all battery operated, all within reach of the GW
  • sensor nodes are asleep most of the time (events wake them up, then they send data while a certain condition is met, then go back to sleep after a while).
  • Updates from sensor nodes to the GW are 10-15 bytes each, at intervals of 1 s to 30 s.
  • Acks from GW to nodes are "required"/very nice to have, but a lack of acks is not necessarily a show stopper, as a few lost updates wouldn't be a massive problem. It is just not a preferred starting point for us.

We have this running fine with GATT connections for up to 20 sensors, at which point we hit the SoftDevice's concurrent connection limit in the GW.

Questions

1) What approaches would be best suited to scale this beyond 20 sensor nodes? We are currently considering the following options: 

a) BLE/GATT/connection based as per our status quo, but adding some logic on top that builds the list of sensor nodes to connect to next. First tests seem to indicate that connecting/disconnecting will take a couple of seconds, so this may be too slow to reach the above goals. Is there a best-practice approach for building a scheduler on top of the 20-connection limit so that the number of devices can increase significantly?

b) Advertisements only / Beacon. No acks... 

c) mesh. All sensor nodes would be LPNs. All of them are within RF reach of the GW. Would this work with mesh? 

d) ESB. 

Which of these makes the most sense for the above scenario?

2) Are there other alternatives that we should consider? 

3) Any other hints that could help us with the above?

Thanks much in advance!

gj

  • Hello,

    1) Whether this is possible really depends on the number of nodes and how often they send.

    100 nodes, 10 B every 30 s ≈ 33 B/s ≈ 0.27 kbps

    100 nodes, 15 B every 1 s = 1.5 kB/s = 12 kbps
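The two estimates above can be reproduced with a small helper. This is only a sketch of the arithmetic: the function name `aggregate_bps` is made up for illustration, and it counts application payload only, so headers, ACKs and retransmissions are not included.

```c
#include <assert.h>
#include <stdint.h>

/* Aggregate payload rate in bits per second for `nodes` sensors that each
 * send `payload_bytes` every `interval_s` seconds.
 * Hypothetical helper: protocol overhead is deliberately ignored. */
double aggregate_bps(uint32_t nodes, uint32_t payload_bytes, double interval_s)
{
    assert(interval_s > 0.0);
    return (double)nodes * payload_bytes * 8.0 / interval_s;
}
```

For example, `aggregate_bps(100, 10, 30.0)` gives roughly 267 bps and `aggregate_bps(100, 15, 1.0)` gives 12000 bps. Scaling the worst case to 300 nodes gives 36000 bps of payload.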

    The issue with mesh networks is that the throughput capacity goes down the more nodes you have. It is really not designed for gathering data from several hundred sensors.

    A standard BLE connection has much higher throughput but, as you say, establishing the connection also takes time.

    ESB is possible. It has ACKs, and may be a possibility for your case. I would recommend having some sort of "connection" here as well, so that not every node is broadcasting its sensor data at all times, as that would increase the overall traffic and with it the chance of packet loss. Basically, every node listens for a specific message that tells it that it is that node's turn to send its payload data.

    Connecting and disconnecting over BLE is also a possibility. To speed up the connection/disconnection time, you can decrease the connection interval on your devices (the initial connection interval). Shortening this will allow you to send data quite quickly.

    So I think BLE or ESB is the way to go. It is difficult to say what would work best. As mentioned, in ESB I would look into having some sort of communication protocol, so that not everyone is sending sensor data at will, to reduce the chance of packet loss. In BLE I would recommend keeping the number of connections (far) below 20, maybe only 2-3, or maybe even only 1 at a time. This way the central can use the radio 100% for scanning whenever it isn't connected, which increases the chance of picking up the advertising packet of a particular node that it hasn't been connected to in a while.

    Whether you use BLE or ESB, I would recommend that the sensor nodes go quiet for as long as possible after connecting and dumping their sensor data, leaving the air to the nodes that haven't been connected in a while.

    So, in BLE, things are a bit more random, while in ESB, you can create your own protocol. Maybe the gateway can broadcast a message with a node ID telling everyone that this node should send the data now.
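The "gateway announces whose turn it is" idea can be sketched as a small gateway-side scheduler that always picks the node that has waited longest since it was last polled. This is a sketch only: `poll_table_t`, `poll_table_init`, `poll_next` and `MAX_NODES` are hypothetical names, no ESB API calls are used, and actually transmitting the poll message and receiving the reply is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 300u  /* hypothetical upper bound on the fleet size */

typedef struct {
    uint32_t last_served_ms[MAX_NODES]; /* time each node was last polled */
    size_t   count;                     /* number of nodes in use */
} poll_table_t;

/* Mark all nodes as "never polled" (timestamp 0). */
void poll_table_init(poll_table_t *t, size_t count)
{
    assert(t != NULL && count > 0 && count <= MAX_NODES);
    t->count = count;
    for (size_t i = 0; i < count; i++) {
        t->last_served_ms[i] = 0;
    }
}

/* Return the ID of the node that has waited longest and record that it is
 * being polled now. Ties go to the lowest ID. The unsigned subtraction
 * stays correct across a single wrap of the millisecond counter. */
size_t poll_next(poll_table_t *t, uint32_t now_ms)
{
    assert(t != NULL && t->count > 0);
    size_t best = 0;
    uint32_t best_age = now_ms - t->last_served_ms[0];
    for (size_t i = 1; i < t->count; i++) {
        uint32_t age = now_ms - t->last_served_ms[i];
        if (age > best_age) {
            best = i;
            best_age = age;
        }
    }
    t->last_served_ms[best] = now_ms;
    return best;
}
```

With equal starting timestamps this degenerates into plain round-robin. The gateway would broadcast the returned ID and then wait for that node's payload; a node that does not answer could simply be polled again on a later pass.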

    It is an interesting problem. I think the throughput is too high for a mesh network of that size. Connecting and disconnecting with that number of nodes is on the edge of what is possible in BLE. The ESB protocol that I imagine you would have to implement is maybe a bit more work, but it may be the best shot if it is not possible with BLE. You can do some testing on how fast you can receive the payload (10-15 Bytes) from a single node using BLE and ESB. 

    I also recommend you read through this guide on advertising (even if you go for ESB), as it gives some perspective on scanning and broadcasting. Note that it is the time you actively use the radio that affects the current consumption. So advertising is cheap, while scanning is expensive, power-wise.

  • We looked at BLE first.

    > Connecting and disconnecting over BLE is also a possibility. To speed up the connection/disconnection time, you can decrease the connection interval on your devices (the initial connection interval). Shortening this will allow you to send data quite quickly.

    Our setup:

    • Used ble_app_uart and ble_app_uart_c as peripheral and central respectively (SDK 15.3.0).
    • Reduced the number of connections to 1.

    Sequence for measuring the time taken for one data collection cycle:

    • Start scanning 
    • Start time measurement 
    • Connect to the device (MTU negotiation, Service Discovery and check if characteristics exist) 
    • Receive data from peripheral 
    • Disconnect peripheral  
    • Stop time measurement 
    • Start scanning ... 

    Parameters changed vs. example projects:

    • central

    #define NRF_BLE_SCAN_SCAN_INTERVAL 80
    #define NRF_BLE_SCAN_SCAN_WINDOW 80

    • peripheral

    #define APP_ADV_INTERVAL 32 
    #define MIN_CONN_INTERVAL            MSEC_TO_UNITS(7.5, UNIT_1_25_MS)  
    #define MAX_CONN_INTERVAL           MSEC_TO_UNITS(40, UNIT_1_25_MS)  
    
    #define SLAVE_LATENCY 0  
    #define CONN_SUP_TIMEOUT MSEC_TO_UNITS(4000, UNIT_10_MS) 
    #define FIRST_CONN_PARAMS_UPDATE_DELAY APP_TIMER_TICKS(5000) 
    #define NEXT_CONN_PARAMS_UPDATE_DELAY APP_TIMER_TICKS(30000)  
    #define MAX_CONN_PARAMS_UPDATE_COUNT 3  
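For readers unfamiliar with the units behind these values: scan and advertising intervals are expressed in 0.625 ms units, connection intervals in 1.25 ms units, and the supervision timeout in 10 ms units. The following sketch converts between them; `units_to_us` and `us_to_units` are hypothetical stand-ins written for this illustration and are not the SDK's `MSEC_TO_UNITS` macro.

```c
#include <assert.h>
#include <stdint.h>

#define UNIT_US_0_625  625u    /* scan / advertising interval unit, in microseconds */
#define UNIT_US_1_25   1250u   /* connection interval unit, in microseconds */
#define UNIT_US_10     10000u  /* supervision timeout unit, in microseconds */

/* Convert a raw unit count to microseconds. */
uint32_t units_to_us(uint32_t units, uint32_t unit_us)
{
    return units * unit_us;
}

/* Convert microseconds to a raw unit count (rounds down). */
uint32_t us_to_units(uint32_t us, uint32_t unit_us)
{
    assert(unit_us > 0);
    return us / unit_us;
}
```

With these, a scan interval and scan window of 80 both come out as 50 ms, so the central scans continuously, and an advertising interval of 32 comes out as 20 ms. A 7.5 ms connection interval is 6 units and 40 ms is 32 units.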

    Results:

    BLE UART central example started. 
    Scan-Started @ 0000 ms 
    Connecting to target 35BD962F40DA 
    ATT MTU exchange completed @  0106 ms 
    Ble NUS max data length set to 0xF4(244) 
    Discovery complete @  0374 ms
    Connected to device with Nordic UART Service. @  0374 ms 
    Receiving data. @ 0464 ms 
    Disconnected. 
    Scan-Started @ 0523 ms 
    Disconnected. conn_handle: 0x0, reason: 0x16 
    Disconnected @  0523 ms 

    Average was 612 ms per cycle. That doesn't sound right. Any ideas what we might be missing?

    Thanks,
    gj

  • Hello,

    Since you already have this project up and running where you measure the time between the events, can you try to set both MIN_CONN_INTERVAL and MAX_CONN_INTERVAL to 7.5 ms on the peripheral? And do the same on the central. On the central this is set in sdk_config.h:

    NRF_BLE_SCAN_MIN_CONNECTION_INTERVAL

    NRF_BLE_SCAN_MAX_CONNECTION_INTERVAL

    Set both of these to 7.5 as well.

    That may shorten the time until discovery completes and the time needed to transfer the data. How much does it impact the time?

    BR,

    Edvin

  • Thank you, Edvin!

    Results:

    
    

    BLE UART central example started. 
    Scan-Started @ 0000 ms 
    Connecting to target 35BD962F40DA 
    ATT MTU exchange completed @  0054 ms 
    Ble NUS max data length set to 0xF4(244) 
    Discovery complete @  0128 ms 
    Connected to device with Nordic UART Service. @  0128 ms 
    Receiving data. @ 0247 ms 
    Disconnected. 
    Scan-Started @ 0262 ms
    Disconnected. conn_handle: 0x0, reason: 0x16 
    Disconnected at  0262 ms 

    • Maximum: 412 ms
    • Minimum: 258 ms
    • Average: 299 ms

    So your info cut the overall time by more than 50% - great! 2-3 more tips of the same calibre will get us into useful territory :-)

  • What seems to be the varying part now? I suspect that the time from "connecting to target" until "Disconnect" is more or less the same, is that correct?

    What you can try is to decrease the advertising interval, but be aware that cutting the advertising interval in half will double the current consumption. I am not sure whether it is possible to shorten it much more using BLE. If you want it quicker, you may need to use ESB, where you control all the packets, and you don't need to wait for service discovery, and all the other packets going back and forth during the establishment of the connection.

    By the way, from the central, when do you enable the notifications?

  • We found an error on our part affecting the previously reported times. Corrected measurements:

    BLE UART central example started. 
    Scan-Started @ 0000 ms 
    Time b/n Scan@ 0000 ms 
    Start Connecting. @ 0011 ms 
    Connecting to target 35BD962F40DA 
    ATT MTU exchange completed @  0056 ms 
    Ble NUS max data length set to 0xF4(244) 
    Discovery complete @  0123 ms 
    Connected to device with Nordic UART Service. @  0123 ms 
    Receiving data. @ 0139 ms 
    Start Disconnecting. @ 0139 ms 
    Disconnected. 
    Scan-Started @ 0153 ms 
    Time b/n Scan@ 0153 ms 
    Disconnected. conn_handle: 0x0, reason: 0x16 
    Disconnected at  0153 ms

    Around 140 ms per cycle.
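To see where the time goes, the timestamps from the corrected log (0, 11, 56, 123, 139 and 153 ms) can be turned into per-stage durations. The helper below is a sketch for that arithmetic only; `stage_deltas` is a hypothetical name and nothing in it depends on the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given `n` increasing timestamps, write the n - 1 stage durations into
 * `delta_ms` and return the index of the longest stage. */
size_t stage_deltas(const uint32_t *stamp_ms, size_t n, uint32_t *delta_ms)
{
    assert(stamp_ms != NULL && delta_ms != NULL && n >= 2);
    size_t longest = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        assert(stamp_ms[i + 1] >= stamp_ms[i]);
        delta_ms[i] = stamp_ms[i + 1] - stamp_ms[i];
        if (delta_ms[i] > delta_ms[longest]) {
            longest = i;
        }
    }
    return longest;
}
```

For the log above the durations are 11 ms (scan start to connect start), 45 ms (until the ATT MTU exchange completes), 67 ms (until discovery completes), 16 ms (until data is received) and 14 ms (until disconnected), so service discovery is the longest single stage in this run.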

    > I suspect that the time from "connecting to target" until "Disconnect" is more or less the same, is that correct?

    Yes, it is around 125-127 ms.

    > What you can try is to decrease the advertising interval

    We were already using the minimum advertising interval (20 ms).

    > By the way, from the central, when do you enable the notifications?

    As in the ble_app_uart_c example project: in ble_nus_c_evt_handler, on the BLE_NUS_C_EVT_DISCOVERY_COMPLETE event. Unchanged from the original Nordic example.

  • Ok. Then I think there is not much more you can do to reduce the time it takes from disconnected -> receive data -> disconnected. 

    If this is still too long for your use case, I think you have to use ESB. 
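Whether the measured cycle time is "too long" can be estimated with a rough capacity calculation. The sketch below assumes one connection at a time, cycles running back to back, and every node needing service within each update interval; it ignores time spent waiting for an advertisement and any failed connection attempts, so real numbers would be lower. The function name `max_nodes_served` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on how many nodes a single-connection gateway can serve
 * once per update interval, given the time one collection cycle takes. */
uint32_t max_nodes_served(uint32_t update_interval_ms, uint32_t cycle_ms)
{
    assert(cycle_ms > 0);
    return update_interval_ms / cycle_ms;
}
```

With the roughly 140 ms cycle reported above, this gives at most 214 nodes at a 30 s update interval, but only 7 nodes at a 1 s interval.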

  • We'll look into ESB. Thank you for your support!
