Data Queues crashing Zephyr code

  1. Hello,
    I am writing code that has two threads. One puts data in a queue and the other gets the data from the queue. The problem is that after I get the data from the queue and print it, the device reports an error and reboots. I think the problem might be in my definition of the queue, since I have a different queue that works normally. Here is how I define it:
    In queues.h:
    struct data_time{
    uint64_t time1;
    uint64_t time2;
    uint64_t time3;
    uint64_t time4;
    };
    K_MSGQ_DEFINE(data_timeq, sizeof(struct data_time), 4, 1);
    In main:
    struct sensor_q_adrrs{
    struct k_msgq* msgq_data_pointer;
    struct k_msgq* msgq_time_pointer;
    };
    struct sensor_q_adrrs sensor_q = {
    .msgq_data_pointer = &data_senq,
    .msgq_time_pointer = &data_timeq,
    } ;
    In the other files that use the queues, so that they are connected to main:
    struct k_msgq* time_q1;
    void register_msgq_addres2(struct k_msgq* ptr_from_main_data, struct k_msgq* ptr_from_main_time){
             printk("\n gets to register_msgq_addres1");
             data_q1 = ptr_from_main_data; time_q1 = ptr_from_main_time;
    }
    I would really appreciate any tips on what I am doing wrong with the definition.
    The error I get when the device reboots is:
    Faulting instruction address (r15/pc): 0x00000000
  • Hello Svetlio,

    Your code looks similar to some code I was recently writing. Given that you are able to "get the data from the queue and it's printed", your Zephyr queues are likely configured and working. The error seems to indicate that the program counter (pc) is pointing to memory address zero; I believe that's what the error message means. I've gotten this when a pointer I thought held a valid address was actually NULL. NULL often translates to zero in many programming languages and contexts, though not all.

    If you add an `if` test to check whether your pointers `data_q1` and `time_q1` are NULL at the places where they are referenced outside of the main.c routines, what result do you see? You can just printk() a short message if they're NULL, instead of using them in the usual way. It's a quick test to see whether they're being assigned the memory addresses you intend them to have.

    - Ted

  • Hello,

    I tried printing the values of time_q1 and data_q1 and I actually get addresses, so the problem is not that they are NULL.

    I was thinking that the problem might be in the definition here:

    K_MSGQ_DEFINE(data_timeq, sizeof(struct data_time), 4, 1);

    The data that I am putting in the queue is uint64_t, and I am not sure the alignment is correct, since I had trouble understanding its purpose.

  • Hello Svetlio,

    My apologies, you may be correct about the question of alignment, which we get to specify in the fourth parameter to K_MSGQ_DEFINE(...).  I should have looked up the notes on this Zephyr macro.  Given the macro use notes at:

    https://docs.zephyrproject.org/2.6.0/reference/kernel/data_passing/message_queues.html?highlight=k_msgq_define#c.K_MSGQ_DEFINE

    I would try changing the last parameter to `sizeof(uint64_t)` or `sizeof(long int)`.  While Zephyr's documentation reads,

    "...The buffer is aligned to a q_align -byte boundary, which must be a power of 2. To ensure that each message is similarly aligned to this boundary, q_msg_size must also be a multiple of q_align."

    and 2^0 equals 1, I wonder whether the Zephyr developers meant to say "nonzero, positive integer power of two". The last sentence in the quote above is what makes me wonder. When q_align has the value 1, it's trivial to say that q_msg_size or any other parameter should be a multiple of q_align, because every positive integer is a multiple of 1.

    If it turns out `sizeof(uint64_t)` as a fourth parameter fails similarly, you could try aligning on the size of your struct data_time.
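As a rough illustration of the two constraints in the quoted documentation, here is a minimal sketch in plain C that compiles outside Zephyr. The helper names `is_power_of_two` and `msgq_params_ok` are hypothetical and are not part of the Zephyr API; they only restate the quoted rule (q_align is a power of 2, and q_msg_size is a multiple of q_align).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the original post. */
struct data_time {
    uint64_t time1;
    uint64_t time2;
    uint64_t time3;
    uint64_t time4;
};

/* Hypothetical helper (not a Zephyr API): true for 1, 2, 4, 8, ... */
static bool is_power_of_two(size_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

/* Hypothetical helper: restates the two documented constraints on
 * the q_msg_size and q_align arguments of K_MSGQ_DEFINE. */
static bool msgq_params_ok(size_t msg_size, size_t align)
{
    return is_power_of_two(align) && (msg_size % align) == 0;
}

/* Compile-time version of the same check for an 8-byte alignment. */
static_assert(sizeof(struct data_time) % sizeof(uint64_t) == 0,
              "message size must be a multiple of the alignment");
```

Under this reading of the rule, `msgq_params_ok(sizeof(struct data_time), 1)` and `msgq_params_ok(sizeof(struct data_time), sizeof(uint64_t))` both hold, since 2^0 = 1 is a power of two and every size is a multiple of 1.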

    Sorry I am not able to run a quick test.  I've been working with function pointers and pointers to data structures, but am not yet using Zephyr queues.  Not sure if I can fully recreate your code in a Zephyr template or empty app of my own.  I see how `data_timeq` is defined, via that macro.  But I don't see how `data_senq` is defined.  I'd be guessing to try and define it with a similar use of K_MSGQ_DEFINE(...).

    This looks like an interesting problem, though, and I'm interested to understand how to solve it from a standpoint of using the queue macros correctly.

    - Ted

  • Hi, 

    From the team:

    The message queue definition looks fine, and the one-byte alignment used should not matter here. However, there is too little code to say anything more: there is no definition of data_senq, and it is not shown how an element is taken from the queue, when register_msgq_address2 is called, etc. Could you elaborate in more detail? It's better to put some more effort into preparing this question if you expect some real help. The full error message could also be helpful.

    Regards,
    Amanda

  • Hello,

    data_senq is not the problem; the code works fine with just that queue, which is why I didn't include more of it in the explanation.

    For the other parts, I changed the .h file a bit:


    struct data_time{
        uint64_t time1;
    };

    K_MSGQ_DEFINE(data_timeq, sizeof(struct data_time), 1, 1);

    This is how I use time_q in the data_gen.c file to write a value to the queue:

    struct k_msgq* time_q;

    int data_gen(){
        uint64_t message_ts = 0;

        err = date_time_now(&message_ts);
        if (err) {
            printk("date_time_now, error: %d\n", err);
            return err;
        }
        aws_put_to_dataq_timeq(sine_table[i], message_ts);
    }
    void register_msgq_addres1(struct k_msgq* ptr_from_main_data, struct k_msgq* ptr_from_main_time){
        data_q = ptr_from_main_data;
        time_q = ptr_from_main_time;
    }
    void aws_put_to_dataq_timeq(uint8_t put_data, uint64_t put_time){
       
        if (data_q == NULL){
            printk("Eda msgq is null! Did you forget to call register_msgq_address?");
            return;
        }
        if (time_q == NULL){
            printk("Eda msgq is null! Did you forget to call register_msgq_address?");
            return;
        }
       
        k_msgq_put(data_q, &put_data, K_NO_WAIT);
        k_msgq_put(time_q, &put_time, K_NO_WAIT);
     
    }
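A note on the two k_msgq_put() calls above: their return values are ignored, and with K_NO_WAIT a put on a full queue returns an error (-ENOMSG, as far as I know) instead of storing the message. Since data_timeq now holds only one message, any timestamp put while the queue is still full would be dropped. The sketch below is a simplified stand-in written in plain C so it runs outside Zephyr; `model_msgq`, `model_put`, `model_get` and `model_demo` are hypothetical names, and this is not Zephyr's implementation. It may or may not explain the symptom, but it shows why checking the return value is worthwhile.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_MSGS 1 /* same depth as K_MSGQ_DEFINE(..., 1, 1) */

/* Simplified stand-in for a message queue; NOT Zephyr's k_msgq. */
struct model_msgq {
    uint64_t slots[MODEL_MAX_MSGS];
    unsigned int head; /* index of the oldest message */
    unsigned int used; /* number of stored messages */
};

/* Non-blocking put: stores a copy, or returns -ENOMSG when full. */
static int model_put(struct model_msgq *q, const uint64_t *msg)
{
    if (q->used == MODEL_MAX_MSGS) {
        return -ENOMSG;
    }
    q->slots[(q->head + q->used) % MODEL_MAX_MSGS] = *msg;
    q->used++;
    return 0;
}

/* Non-blocking get: returns -ENOMSG and leaves *msg untouched when empty. */
static int model_get(struct model_msgq *q, uint64_t *msg)
{
    if (q->used == 0) {
        return -ENOMSG;
    }
    *msg = q->slots[q->head];
    q->head = (q->head + 1) % MODEL_MAX_MSGS;
    q->used--;
    return 0;
}

/* Two puts before any get: the second timestamp is dropped. */
static void model_demo(void)
{
    struct model_msgq q = {0};
    uint64_t first = 111, second = 222, out = 0;

    assert(model_put(&q, &first) == 0);
    assert(model_put(&q, &second) == -ENOMSG); /* queue full, message lost */
    assert(model_get(&q, &out) == 0 && out == first);
    assert(model_get(&q, &out) == -ENOMSG);    /* empty, out keeps old value */
}
```

In the real code, the equivalent step would be to store the result of each k_msgq_put() and k_msgq_get() call and printk() it when it is nonzero, so that dropped or missing messages become visible.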
    And this is how I read the data from the queue in the MQTT_AWS.c file:
    static int data_publish()
    {
        int err=0;
        char *message;
        uint64_t message_ts = 0;
        uint8_t sen_val = 0;
     

        if(k_msgq_num_used_get(data_q1) > 0){
            if(k_msgq_num_used_get(time_q1) > 0){
                aws_get_from_q(&message_ts, &sen_val);
                if (root_obj == NULL){
                    cJSON_Delete(root_obj);
                    err = -ENOMEM;
                    return err;
                }

                err += json_add_number(root_obj, "sen", sen_val);
                err += json_add_number(root_obj, "ts", message_ts);

                if (err) {
                    printk("json_add, error: %d\n", err);
                    goto cleanup;
                }

                message = cJSON_Print(root_obj);
                if (message == NULL) {
                    printk("cJSON_Print, error: returned NULL\n");
                    err = -ENOMEM;
                    goto cleanup;
                }

                struct aws_iot_data tx_data = {
                    .qos = MQTT_QOS_0_AT_MOST_ONCE,
                    .topic.type = AWS_IOT_SHADOW_TOPIC_UNKNOWN,
                    .topic.str = topic_string,
                    .topic.len = sizeof(topic_string),
                    .ptr = message,
                    .len = strlen(message)
                };


                   
               
                printk("Publishing: %s to AWS IoT broker\n", message);

                err = aws_iot_send(&tx_data);
                if (err) {
                    printk("aws_iot_send, error: %d\n", err);
                }

                cJSON_FreeString(message);
               
            }
           
        }
       
    cleanup:
        cJSON_Delete(root_obj);
       
        return err;
    }
    void aws_get_from_q(uint64_t* get_time, uint8_t *get_data){
            k_msgq_get(time_q1, get_time, K_NO_WAIT);
            k_msgq_get(data_q1, get_data, K_NO_WAIT);
    }

    void register_msgq_addres2(struct k_msgq* ptr_from_main_data, struct k_msgq* ptr_from_main_time){
        data_q1 = ptr_from_main_data;
        time_q1 = ptr_from_main_time;
    }
    Currently the code gets data out and it doesn't crash, but the data is not correct and it's always the same number. It's a UNIX timestamp, but for a completely different time than it currently is.
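One thing that could be checked here is the unit of the timestamp. This is a minimal sketch in plain C, assuming that date_time_now() reports milliseconds since the UNIX epoch (which is my understanding of the nRF Connect SDK date_time library, but please verify it against the header you build with). If that assumption holds, reading the value as seconds would give a date far away from the current one. The helper names `unix_ms_to_s`, `utc_year_from_s` and `timestamp_demo` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: milliseconds since the epoch -> whole seconds. */
static int64_t unix_ms_to_s(int64_t unix_ms)
{
    return unix_ms / 1000;
}

/* Hypothetical helper: UTC calendar year for a timestamp given in
 * seconds, or -1 if the C library cannot convert it. */
static int utc_year_from_s(int64_t unix_s)
{
    time_t t = (time_t)unix_s;
    struct tm *tm_utc = gmtime(&t);

    return tm_utc ? tm_utc->tm_year + 1900 : -1;
}

/* The same number interpreted with the right and the wrong unit. */
static void timestamp_demo(void)
{
    const int64_t ms = 1625097600000LL; /* 2021-07-01 00:00:00 UTC in ms */

    assert(unix_ms_to_s(ms) == 1625097600LL);
    assert(utc_year_from_s(unix_ms_to_s(ms)) == 2021);
    assert(utc_year_from_s(ms) != 2021); /* ms misread as seconds */
}
```

If the printed value has 13 digits rather than 10, it is most likely in milliseconds, and an online epoch converter set to seconds would show a completely different time.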