Dear NCS experts,
I'm working with nRF Connect SDK (1.7.1) on a proprietary Bluetooth Mesh model. When the model receives its configuration, it stores it in flash by calling bt_mesh_model_data_store(). The next time the model is powered on, it loads its configuration through the bt_mesh_model_cb.settings_set callback.
Now the problem is that the data read in bt_mesh_model_cb.settings_set starts with four leading bytes (could be an int) that do not belong to the model's configuration. The model's actual configuration starts at the fifth byte of the read data, but is four bytes short.
It kinda looks as if bt_mesh_model_cb.settings_set is reading from an address that's sizeof(int) before the address where the actual data is stored. It could also be that bt_mesh_model_data_store() writes to an address that's four bytes behind; I haven't been able to debug that deep yet. What's a little remarkable is the value of the first four bytes, which, read as an integer, is 0x20000710. That matches the address where the bt_mesh_model structure is stored in memory. Coincidence? Maybe...
The question now is: what could cause this behavior, and how can I fix it?
Now, honestly, I doubt that a bug like this would go unnoticed in Zephyr, so let's have a look at my code instead:
This is what the model's user data looks like:
#define MAX_VALUES 16
#define MAX_NAME_LENGTH 30

struct model_value
{
    uint16_t time;
    uint8_t value;
};

/*
 * Size in memory: 104 bytes: (10 + 64 + 30)
 * Required buffer (packed): 88 bytes: (10 + 48 + 30)
 */
struct my_configuration
{
    uint32_t id;
    int32_t uptime_correction;
    uint8_t max_in_percent;
    uint8_t value_cnt;
    struct model_value values[MAX_VALUES];
    char name[MAX_NAME_LENGTH];
};

struct my_model_user_data
{
    struct bt_mesh_model *model;
    struct my_configuration model_cfg;
    struct k_mutex cfg_mutex;
};
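The size figures in the comment above can be checked at compile time. Below is a minimal host-side sketch, assuming a typical ABI where uint16_t is 2-byte aligned and uint32_t is 4-byte aligned (as on 32-bit Arm Cortex-M targets and common desktop compilers). packed_config_size() is a hypothetical helper added for illustration only; it is not part of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VALUES 16
#define MAX_NAME_LENGTH 30

struct model_value {
    uint16_t time;
    uint8_t value;
};

struct my_configuration {
    uint32_t id;
    int32_t uptime_correction;
    uint8_t max_in_percent;
    uint8_t value_cnt;
    struct model_value values[MAX_VALUES];
    char name[MAX_NAME_LENGTH];
};

/* 3 bytes of members, padded up to the 2-byte alignment of uint16_t. */
_Static_assert(sizeof(struct model_value) == 4, "unexpected model_value size");
/* 4 + 4 + 1 + 1 = 10 bytes precede the array; 10 is already 2-byte aligned. */
_Static_assert(offsetof(struct my_configuration, values) == 10, "unexpected offset");
/* 10 + 16 * 4 = 74 */
_Static_assert(offsetof(struct my_configuration, name) == 74, "unexpected offset");
/* 74 + 30 = 104, which is already a multiple of 4, so no tail padding. */
_Static_assert(sizeof(struct my_configuration) == 104, "unexpected struct size");

/* Hypothetical helper: size of the configuration with all padding removed. */
size_t packed_config_size(void)
{
    return sizeof(uint32_t) + sizeof(int32_t) + 2 * sizeof(uint8_t)
         + MAX_VALUES * (sizeof(uint16_t) + sizeof(uint8_t))
         + MAX_NAME_LENGTH;
}
```

If one of the static assertions fails on a given toolchain, the 104-byte figure does not hold there and the stored blob size would differ accordingly.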
The "set configuration" opcode handler (below) just parses the incoming data, validates it, and saves it to flash.
Note: When the mutex is released, both new_cfg and user_data->model_cfg contain a valid configuration.
static int handle_message_set_config(struct bt_mesh_model *model, struct bt_mesh_msg_ctx *ctx, struct net_buf_simple *buf) {
    struct my_model_user_data *user_data = model->user_data;
    struct my_configuration new_cfg;

    /* Parse the incoming message into a local copy and reject invalid data. */
    parse_config(&new_cfg, buf);
    if (!is_configuration_valid(&new_cfg)) {
        return -1;
    }

    if (k_mutex_lock(&user_data->cfg_mutex, K_MSEC(WAIT_FOR_MUTEX)) == 0) {
        /* Update the in-memory configuration, then persist the same bytes. */
        memcpy(&user_data->model_cfg, &new_cfg, sizeof(struct my_configuration));
        bt_mesh_model_data_store(model, true, DATA_STORE_KEY, &new_cfg, sizeof(struct my_configuration));
        k_mutex_unlock(&user_data->cfg_mutex);
    }

    return reply_status(model, ctx, MSG_SET_CFG_STATUS);
}
Edit: Looking at the code above, it kinda looks like bt_mesh_model_data_store() ignores the fourth parameter and uses the address of the model's my_model_user_data structure instead. Just a gut feeling... Can somebody with some in-depth knowledge please confirm? That would perfectly explain the behavior.
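To make that gut feeling concrete, here is a minimal host-side sketch of the suspected effect. It is not Zephyr code and does not show that bt_mesh_model_data_store() actually behaves this way; it only checks that the hypothesis would produce the symptom I see. Everything in it is an assumption for illustration: fake_user_data stands in for my_model_user_data on a 32-bit target (the model pointer is represented by a uint32_t so the layout is the same on a 64-bit host), and simulate_misaddressed_store() and looks_shifted_by_pointer() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 104 /* sizeof(struct my_configuration) on the target */

/* Stand-in for my_model_user_data as laid out on a 32-bit target. */
struct fake_user_data {
    uint32_t model_addr;         /* the 4-byte `model` pointer */
    uint8_t model_cfg[CFG_SIZE]; /* the raw bytes of `model_cfg` */
};

/*
 * Hypothetical faulty store: copies CFG_SIZE bytes starting at the
 * beginning of the user data instead of at the configuration itself.
 */
void simulate_misaddressed_store(const struct fake_user_data *ud,
                                 uint8_t stored[CFG_SIZE])
{
    memcpy(stored, ud, CFG_SIZE);
}

/*
 * Returns 1 if `read_back` starts with the model address followed by the
 * first CFG_SIZE - 4 bytes of the configuration, i.e. the observed symptom.
 */
int looks_shifted_by_pointer(const uint8_t read_back[CFG_SIZE],
                             const struct fake_user_data *ud)
{
    uint32_t head;

    memcpy(&head, read_back, sizeof(head));
    return head == ud->model_addr &&
           memcmp(read_back + sizeof(head), ud->model_cfg,
                  CFG_SIZE - sizeof(head)) == 0;
}
```

Under this assumption the stored blob still has the expected length, begins with the pointer value, and loses the last four bytes of the configuration, which is the pattern described above. Note that the handler passes &new_cfg, a stack copy, so this would only apply if the data argument were ignored somewhere along the way.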
The method to read the data during boot looks like this:
static int my_model_settings_set(struct bt_mesh_model *model, const char *name, size_t len_rd, settings_read_cb read_cb, void *cb_arg)
{
    struct my_model_user_data *user_data = model->user_data;

    if (NULL == name) {
        return -ENOENT;
    }

    struct my_configuration configuration;
    ssize_t bytes_read = read_cb(cb_arg, &configuration, sizeof(struct my_configuration));

    /*
     * Here's where the problem appears:
     * 1. bytes_read matches sizeof(struct my_configuration).
     * 2. The first 4 bytes read are invalid, making the whole content of configuration invalid.
     * 3. The actual configuration starts at the fifth byte read.
     * 4. Because of the four invalid bytes at the beginning, all values are stored four bytes off and the last four bytes are missing.
     */
    if (bytes_read == sizeof(struct my_configuration) && is_configuration_valid(&configuration)) {
        memcpy(&user_data->model_cfg, &configuration, sizeof(struct my_configuration));
    } else {
        // We end up here because is_configuration_valid() recognizes the data corruption.
    }

    return 0;
}
Simple code... Still, I would appreciate it if you could lend a hand. Do you see anything that's obviously wrong?
Your help is very much appreciated.
Thank you,
Michael.
P.S.: To whomever is responsible for this forum: Please fix it! It took 10+ reloads until the "edit" button became visible...