gklijs / schema_registry_converter

A crate to convert bytes to something more useable and the other way around in a way Compatible with the Confluent Schema Registry. Supporting Avro, Protobuf, Json schema, and both async and blocking.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Support additional local protobuf definitions

jeffutter opened this issue · comments

The protobuf definitions in our Schema Registry contain some references to the Google common protos which are not in the registry, but rather shipped locally with our applications.

It would be helpful if this library allowed loading additional proto definitions to be used when decoding with ProtoDecoder.

It may be possible to do this by hand with protofish and the raw proto decoder, but it would be helpful if this were baked in.

I know those were added to the Confluent Schema registry, some time after I last changed some things. I think this would indeed be a nice addition.

Tobe sure I understand the problem correctly:
Current behavior: when there is a reference to a Google common proto some error is given. (Something like can't resolve reference).
Wanted behavior: Google common proto definitions are included in the crate, and used when there are references to it.

Hey 👋, thanks for the quick reply. Some more details I should have included when I submitted the initial request. Here are some debug logs when decoding a protobuf message against our schema registry:

[2022-08-05T22:01:41Z DEBUG reqwest::connect] starting new connection: https://myacct.confluent.cloud/
[2022-08-05T22:01:41Z DEBUG reqwest::async_impl::client] response '200 OK' for https://myacct.confluent.cloud/schemas/ids/100023?deleted=true
[2022-08-05T22:01:41Z DEBUG reqwest::async_impl::client] response '404 Not Found' for https://myacct.confluent.cloud/subjects/google/protobuf/timestamp.proto/versions/1
thread 'main' panicked at 'Error getting value: Error: Could not get id from response had no other cause, it's retriable: false, it's cached: false', src/main.rs:44:23

you can see it loaded the initial schema, but then failed finding the Google common one.

Small update, had some trouble getting my local working again. I did added an additional schema to schema_registry_test_app which should make it easy to add the integration test, and work trough how it's working.

For now I just added the types, it was a bit of hassle since there did not seem to be an easy to use crate. So I ended up updating the lexer used to get the imports, and get those schema's so they can be processed. See #84 feel free to reopen this issue, or add another for other use cases.

👋 Sorry to take so long to finally get back to this. Thanks for adding that set of protos. I'm running into issues with more expected types. Specifically the google "well known types". More specifically I have protobufs that expect google.protobuf.Timestamp which is documented here: https://developers.google.com/protocol-buffers/docs/reference/google.protobuf and the source for which is here: https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/timestamp.proto

I'm not sure if there is a canonical list of these protos or if it's just the other ones mentioned in those docs and littered through that folder in the source code. If it helps any, it does look like the rust protobuf library has some code-gen to generate modules for all of these: https://github.com/stepancheg/rust-protobuf/blob/f7259280e1b64a8bafd505b688d35f93552d0a38/protobuf-codegen/src/gen/well_known_types.rs#L84

Timestamp is included, but only as part of the common type interval. The import thing is if it's supported in Java, which does seems to be the case.
It's all a bit confusing to me, since the 'common types' from the last link, doesn't actually contain the proto's with com.google.type namespace.

Since being compatible with Java is the main goal, if Timestamp, and related ones, indeed work, this can be considered a bug. Adding them will be easy. Please note that this only 'solves' deserializing. For serialization you might also need the protobuf crate.

Ahh yes - I was a little sleepy when posting that last night :-D I did make a little progress this morning. I was able to decode the protobuf in question by making this patch to schema_registry_converter:

diff --git a/src/proto_common_types.rs b/src/proto_common_types.rs
index 9afbee5..f760925 100644
--- a/src/proto_common_types.rs
+++ b/src/proto_common_types.rs
@@ -32,6 +32,7 @@ enum CommonType {
     Quaternion,
     TimeOfDay,
     Type,
+    Timestamp,
 }
 
 #[derive(Clone, Debug, PartialEq)]
@@ -79,6 +80,7 @@ fn is_common_import(import: &str) -> Option<CommonType> {
         "google/type/quaternion.proto" => Some(CommonType::Quaternion),
         "google/type/timeofday.proto" => Some(CommonType::TimeOfDay),
         "google/type/type.proto" => Some(CommonType::Type),
+        "google/protobuf/timestamp.proto" => Some(CommonType::Timestamp),
         _ => None,
     }
 }
@@ -103,6 +105,7 @@ fn get_schemas(common_type: CommonType) -> &'static [CommonSchema] {
         CommonType::Quaternion => &[CommonSchema::Quaternion],
         CommonType::TimeOfDay => &[CommonSchema::TimeOfDay],
         CommonType::Type => &[CommonSchema::Type],
+        CommonType::Timestamp => &[CommonSchema::Timestamp],
     }
 }

Seems like we're expecting it in a different namespace...

I need to double-check, but I guess the 'other' ones are supported on the Java side. I'll try to add and release a fix next weekend.