jhelovuo / RustDDS

Rust implementation of Data Distribution Service

Data reader not receiving all samples

CountryTk opened this issue · comments

Hi, I've come across another problem.

Seemingly out of the blue (I updated RustDDS to the latest commit) my data reader has become unable to read messages that aren't sent by the writers created by RustDDS.

I start up my program and it starts sending heartbeat messages to IntercomDDS.
The heartbeat data reader only "reads" the messages I send to IntercomDDS, not the ones I should receive from IntercomDDS.

I confirmed IntercomDDS is working fine by listening to those messages with FastDDS: there I see both the heartbeat I send and the heartbeat they send.

How should I start debugging the issue?

Also, there's a bug in cdr_deserializer.rs on line 258, where
let bytes_withouT_null = &bytes[0..bytes.len() - 1];
will fail when bytes is empty.

If this started after upgrading to the latest commit on the GitHub master branch, and an earlier version works, please use git bisect on RustDDS to find the commit where the bug appears. It should then be easier to inspect the source changes.

> Also, there's a bug in cdr_deserializer.rs on line 258, where
> let bytes_withouT_null = &bytes[0..bytes.len() - 1];
> will fail when bytes is empty.

Did this actually fail by panic or some other symptom?
Or did you discover this by reading the source?

The code is somewhat hazardous, but should mostly work. The CDR specification states "The string length includes the null character, so an empty string has a length of 1." In such a case, we would get bytes[0..0] , which is (correctly) an empty slice in Rust.

But wrongly serialized input data would cause a panic, so this should be fixed.
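The slicing behaviour described above can be checked in isolation. This is a minimal sketch, not the RustDDS code: `strip_nul_unchecked` only mirrors the quoted expression, and `strip_nul_checked` is a hypothetical non-panicking alternative introduced here for illustration.

```rust
/// Mirrors the quoted slicing expression: drop the last byte, which is
/// assumed to be the terminating NUL. Panics when `bytes` is empty,
/// because `bytes.len() - 1` cannot be computed for a length of 0.
fn strip_nul_unchecked(bytes: &[u8]) -> &[u8] {
    &bytes[0..bytes.len() - 1]
}

/// Hypothetical alternative (not taken from RustDDS): returns `None`
/// instead of panicking when there is no byte to strip.
fn strip_nul_checked(bytes: &[u8]) -> Option<&[u8]> {
    bytes.split_last().map(|(_last, rest)| rest)
}
```

With a one-byte input (the valid encoding of an empty string) both functions yield an empty slice. With a zero-byte input the first one panics and the second returns `None`, which the caller could turn into an `Err(..)`.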

Okay, so I found the line of code that made my readers unable to read data from IntercomDDS:

src/dds/reader.rs, line 574. The commit "Reader to ignore data from unknown writers" added this line:

if writer_guid.entity_id.entity_kind.is_user_defined() {return;}

After removing it, I can successfully receive messages from IntercomDDS. Is this perhaps a misconfiguration in my RustDDS code?

Also, about bytes.len(): I found that bug when IntercomDDS sent me an empty string as part of a response, which made Rust panic with a subtract overflow.

> Okay, so I found the line of code that made my readers unable to read data from IntercomDDS:
>
> src/dds/reader.rs, line 574. The commit "Reader to ignore data from unknown writers" added this line:
>
> if writer_guid.entity_id.entity_kind.is_user_defined() {return;}
>
> After removing it, I can successfully receive messages from IntercomDDS. Is this perhaps a misconfiguration in my RustDDS code?

This is most likely a symptom of the RustDDS Reader not matching the IntercomDDS Writer, so messages are rejected as not originating from a matched Writer. Check your QoS settings, and check the logs for match/mismatch messages on that Topic.

This line of code is necessary: if you remove that check, messages may "leak" from one Topic to another.
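The idea of rejecting data from unmatched Writers can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyReader`, `WriterGuid`, and `handle_data` are hypothetical names invented for this example and do not reflect how RustDDS is actually structured.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a writer GUID (not the RustDDS type).
type WriterGuid = [u8; 16];

/// Toy reader that only accepts data from writers it has matched with.
struct ToyReader {
    matched_writers: HashSet<WriterGuid>,
    received: Vec<Vec<u8>>,
}

impl ToyReader {
    /// Returns true if the payload was accepted, false if it was dropped
    /// because the sending writer has not been matched.
    fn handle_data(&mut self, writer: WriterGuid, payload: &[u8]) -> bool {
        if !self.matched_writers.contains(&writer) {
            return false; // unknown writer: ignore, so data cannot leak in
        }
        self.received.push(payload.to_vec());
        true
    }
}
```

In this model, a writer whose QoS, Topic name, or type name did not match would never be inserted into `matched_writers`, so its samples would be dropped silently. That is consistent with the symptom described in this thread.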

> Also, about bytes.len(): I found that bug when IntercomDDS sent me an empty string as part of a response, which made Rust panic with a subtract overflow.

Ok, so it is a bug.

As per the CDR specification,
"A string is encoded as an unsigned long indicating the length of the string in octets,
followed by the string value in single- or multi-byte form represented as a sequence of
octets. The string contents include a single terminating null character. The string
length includes the null character, so an empty string has a length of 1."

Can you confirm with Wireshark that IntercomDDS follows this spec? Or does it send an empty string as length=0, followed by perhaps the null character? If the serialized length field is zero, then that would be a bug in IntercomDDS also.

The documentation I was given says that the heartbeat reader should follow the default QoS settings, but I can't find any concrete information about what those defaults are supposed to be. So in my RustDDS implementation I left them empty: I just built the QoS without any parameters.

I looked in Wireshark, and it looks like the serialized length field is zero, which indicates a fault in IntercomDDS. I still think it's worth a fix to make the system more fault tolerant. Should I make a PR? I would just have it take an empty slice when bytes is empty. (I tested it and it worked in my case.)

> The documentation I was given says that the heartbeat reader should follow the default QoS settings, but I can't find any concrete information about what those defaults are supposed to be. So in my RustDDS implementation I left them empty: I just built the QoS without any parameters.

Compatible QoS is not the only requirement. Topic names and type names must also match. Reader logs at "info" level should tell if there is a match or not.

> I looked in Wireshark, and it looks like the serialized length field is zero, which indicates a fault in IntercomDDS. I still think it's worth a fix to make the system more fault tolerant. Should I make a PR? I would just have it take an empty slice when bytes is empty. (I tested it and it worked in my case.)

Yes, please make a PR. Returning an empty slice is reasonable, but how do we continue CDR deserialization then? How many bytes of the incoming message are you going to consume in deserialize_str if the length deserializes as zero?

If the bug in IntercomDDS is that the string length never includes the terminating NUL character, and is always one too small, then one additional byte should be consumed (via next_bytes()). But if the length is wrong only in the case of an empty string, then no data past the length field should be consumed.

Stricter error handling would return an Err(..), but then it would not work with IntercomDDS.
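The three options above differ only in how many bytes are consumed after a zero length field. The following is a simplified sketch, not the RustDDS deserializer: `ZeroLen` and `read_str` are hypothetical names, the input is assumed to be little-endian, and `pos` is assumed to be 4-byte aligned already. The `ConsumeNul` policy covers only the zero-length case; if the length were always one too small, non-zero lengths would need the same adjustment.

```rust
/// Hypothetical policy for a string length field of 0, which CDR does not allow.
#[derive(Clone, Copy)]
enum ZeroLen {
    /// Strict: reject the input with an error.
    Reject,
    /// Lenient: empty string, consume nothing after the length field.
    ConsumeNothing,
    /// Lenient: empty string, assume a NUL follows and consume that one byte.
    ConsumeNul,
}

/// Reads one little-endian CDR string whose length field starts at `pos`.
/// Returns the string and the position of the first byte after it.
fn read_str(buf: &[u8], pos: usize, policy: ZeroLen) -> Result<(&str, usize), String> {
    let b = buf.get(pos..pos + 4).ok_or("truncated length field")?;
    let len = u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize;
    let start = pos + 4;
    if len == 0 {
        return match policy {
            ZeroLen::Reject => Err("string length 0 is not valid CDR".to_string()),
            ZeroLen::ConsumeNothing => Ok(("", start)),
            ZeroLen::ConsumeNul => Ok(("", start + 1)),
        };
    }
    // Valid case: `len` bytes follow, the last of which is the NUL terminator.
    let raw = buf.get(start..start + len).ok_or("truncated string")?;
    let (_nul, text) = raw.split_last().ok_or("missing NUL terminator")?;
    let s = std::str::from_utf8(text).map_err(|e| e.to_string())?;
    Ok((s, start + len))
}
```

The caller would still have to skip pad bytes before the next aligned field. Note that the two lenient policies return different end positions for the same input, so a wrong guess about the sender's behaviour shifts every field that follows.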

Returning an empty slice is reasonable, but how do we continue CDR deserialization then? How many bytes of the incoming message are you going to consume in deserialize_str if the length deserializes as zero?

To be honest, I'm not quite sure I understand. If I get sent a struct that contains 2 integer fields and 1 string field, and the string field is empty, then deserialize_string just consumes an empty slice and returns an empty string.

> But if the length is wrong only in the case of an empty string, then no data past the length field should be consumed.

Does this mean that if I get sent a struct like
struct Message { u32 id; string message; u32 status; }
and the message field is an empty string, it shouldn't read the data past it? So u32 status wouldn't be read?

> > But if the length is wrong only in the case of an empty string, then no data past the length field should be consumed.
>
> Does this mean that if I get sent a struct like
> struct Message { u32 id; string message; u32 status; }
> and the message field is an empty string, it shouldn't read the data past it? So u32 status wouldn't be read?

For example, if you have

Message{ id: 3 , message: "", status: 5, }

A correct CDR serialization is

Each row below is 4 bytes, or one 32-bit word:

| 03 00 00 00 |  id=3, LSB first
| 01 00 00 00 |  string length, including NUL
| 00 XX XX XX |  NUL character + 3 pad bytes
| 05 00 00 00 |  status=5
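The layout above can be reproduced with a small encoder. This is a minimal sketch under stated assumptions, not the RustDDS serializer: `Message`, `pad_to`, and `encode_le` are hypothetical names, the encoding is little-endian, alignment is counted from the start of the buffer, and pad bytes (shown as XX above) are written as zeros.

```rust
/// Hypothetical stand-in for the example struct.
struct Message<'a> {
    id: u32,
    message: &'a str,
    status: u32,
}

/// Appends zero pad bytes until `out.len()` is a multiple of `align`.
fn pad_to(out: &mut Vec<u8>, align: usize) {
    while out.len() % align != 0 {
        out.push(0);
    }
}

/// Encodes `m` as little-endian CDR: u32, string, u32.
fn encode_le(m: &Message<'_>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&m.id.to_le_bytes());
    // String: 4-byte length that counts the NUL, then the bytes, then the NUL.
    let len = m.message.len() as u32 + 1;
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(m.message.as_bytes());
    out.push(0);
    // The following u32 must start on a 4-byte boundary.
    pad_to(&mut out, 4);
    out.extend_from_slice(&m.status.to_le_bytes());
    out
}
```

For `Message { id: 3, message: "", status: 5 }` this produces the four 32-bit words listed above, with the three pad bytes set to zero.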

Now, if we receive a message payload

| 03 00 00 00 |  id=3, LSB first
| 00 00 00 00 |  string length, including NUL (wrong, cannot be 0)
| ?? ?? ?? ?? |  <-- How to decode this? Is this the string's NUL terminator + padding, or is it status?
| ...         |

If the string length field is already something the spec does not allow, how do we read the rest of the string? Where does status field begin?

The IntercomDDS version we use is actually a bit outdated; I hope they have fixed this issue in newer releases.

If the spec doesn't allow it, I don't think there is going to be a solution that isn't hacky, and hacky solutions are bad, so I think it should just return an Err() instead.

I'm going to use my own fork of RustDDS where I "circumvent" this issue but only because of the old / buggy IntercomDDS version we have.

You can also make a PR of your hack. Just state clearly in your comments that it is a hack, and state its purpose (IntercomDDS, and which version).

PR made