uuid-rs / uuid

Generate and parse UUIDs.

Home Page: https://www.crates.io/crates/uuid

base64 serde module

dhduvall opened this issue · comments

Motivation
I have a JSON document with a normal string representation of a UUID in it (five pieces with dashes). I use serde to deserialize into a struct whose type for that field is Uuid. I hand this off to the MongoDB client crate and dump it into the database (which is actually Azure CosmosDB). I then use mongosh to pull it out and write it to a file:

fs.writeFileSync('out.json', JSON.stringify(db.alerts.find({"_id": "<id string>"}, {}).toArray()[0]))

The resulting file has the UUID fields encoded as base64, rather than the normal UUID representation.

It would be nice to be able to feed that document back into my code and have it recognize those fields as valid UUIDs without having to go and manually change them.

Solution
I'm not sure, but I think that this could look like adding another serde module like the existing compact module, or adding another format type.

Alternatives
I'll probably end up with a custom deserialize_with function that first tries the standard UUID deserialization and, if that fails, tries base64. It shouldn't be a whole lot of code, but it might be useful to someone else in the future to have this baked in.

Is it blocking?
No.

Anything else?
Nothing.

Hmm, that's interesting. Does the MongoDB client serialize to a non-human-readable format, which then translates serde's bytes type into a base64-encoded string? In human-readable formats we should serialize to a hyphenated string.

If you've already got data in your system in this format then coming up with a base64 decoder in your code might be the best way to go. It might be a bit niche to include in this library, but this issue will probably help anybody else who runs into the same problem.

I'm not sure where the UUID is being converted into base64; it may even be happening more than once (possibly on its way into the database, possibly on its way into the file). I suspect that it's happening, like you say, because some part of that pipeline has the UUID in its binary form, but without enough type information to recognize that it has a special serialization, so when it needs to be stored in a JSON doc, it ends up with generic base64 serialization. I don't know enough about the way MongoDB or its client work to understand this in more than a hand-wavy way.
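To make the relationship between the two forms concrete, here is a rough std-only sketch (no uuid or base64 crate involved). It assumes the file uses the standard base64 alphabet with `=` padding and that the 16 bytes are in the usual UUID byte order; I haven't confirmed either for MongoDB or CosmosDB. All of the function names are made up for illustration.

```rust
// Std-only sketch; these helper names are hypothetical and not part of the uuid crate.

/// Standard base64 alphabet (RFC 4648). Assumption: the exported file uses
/// this alphabet with `=` padding.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded standard base64.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::new();
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let mut n: u32 = 0;
        for (i, &b) in chunk.iter().enumerate() {
            n |= (b as u32) << (16 - 8 * i);
        }
        // Emit one character per 6-bit group that carries data, then pad.
        for i in 0..4 {
            if i <= chunk.len() {
                let idx = ((n >> (18 - 6 * i)) & 0x3f) as usize;
                out.push(ALPHABET[idx] as char);
            } else {
                out.push('=');
            }
        }
    }
    out
}

/// Decode standard base64, stopping at the first `=`.
/// Returns None if a character outside the alphabet is found.
fn base64_decode(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits = 0;
    for c in s.bytes() {
        if c == b'=' {
            break;
        }
        let v = ALPHABET.iter().position(|&a| a == c)? as u32;
        buf = (buf << 6) | v;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            // The cast keeps only the low 8 bits, i.e. the next decoded byte.
            out.push((buf >> bits) as u8);
        }
    }
    Some(out)
}

/// Format 16 bytes in the usual five-group hyphenated form.
fn hyphenated(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

/// Turn a base64 string back into hyphenated UUID text.
/// Returns None unless it decodes to exactly 16 bytes.
fn base64_to_hyphenated(s: &str) -> Option<String> {
    let decoded = base64_decode(s)?;
    if decoded.len() != 16 {
        return None;
    }
    let mut bytes = [0u8; 16];
    bytes.copy_from_slice(&decoded);
    Some(hyphenated(&bytes))
}
```

With these helpers, the bytes 0x00 through 0x0f encode to the 24-character string `AAECAwQFBgcICQoLDA0ODw==`, and `base64_to_hyphenated` turns that string back into `00010203-0405-0607-0809-0a0b0c0d0e0f`. The length difference (24 characters versus 36) is also a cheap way to tell which form a field is in.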

Whatever the mechanism, it ends up as base64 in the file, and if I want to read the file back in again, I have to do something special. For anyone who needs it, here's what I ended up with:

  use std::str::FromStr;

  use serde::{Deserialize, Deserializer};
  use uuid::Uuid;

  /// Try to deserialize a UUID first, and if that fails, try base64-decoding and then converting to
  /// a UUID.  This assumes that the input is deserializable as a String.  This is useful when
  /// pulling documents out of MongoDB, which will have serialized the UUID byte blob as base64.
  fn deserialize_uuid_base64<'de, D>(deserializer: D) -> Result<Uuid, D::Error>
  where
      D: Deserializer<'de>,
  {
      use serde::de::Error;

      // If we can't decode from a string, then something's wrong.
      let s = String::deserialize(deserializer)?;

      // If Uuid can make sense of that, great.
      if let Ok(uu) = Uuid::from_str(s.as_ref()) {
          return Ok(uu);
      }

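      // Otherwise, try to base64-decode.  (`base64::decode` was deprecated in base64 0.21 in
      // favor of the `Engine` API, so newer versions of that crate need a different call here.)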
      let uuid_bytes = base64::decode(s).map_err(Error::custom)?;

      // Convert the resulting Vec<u8> to &[u8] and see if that's a UUID.
      Uuid::from_slice(uuid_bytes.as_slice()).map_err(Error::custom)
  }

Thanks for the note @dhduvall 👍 Hopefully that will help anybody else who finds themselves in this same pickle. I think this is a bit out-of-scope for uuid to handle itself so I'll go ahead and close this one, but even as a closed issue this will be a useful resource for anybody searching how to deal with Uuids getting serialized as base64 strings.