CosmosDB: Guid IDs instead of object IDs of Vertex?

Question

BenjaminAbt opened this issue 3 years ago · comments

Is there any sample or way to work with a typed value like Guid instead of object for the Vertex Id?
Object is extremely error-prone.

We have already tried to work with Guid directly and with struct-based abstractions.
This works so far for reading, but when writing vertices, various argument exceptions are thrown, such as:

Gremlin.Net.Driver.Exceptions.ResponseException: InvalidRequestArguments: 

ActivityId : a6f0053f-ab75-4b99-92bb-6f0934182c84
ExceptionType : ArgumentException
ExceptionMessage :
	Value of variable _b is not a constant type. Cannot assign complex values to groovy variable. (Parameter 'value')
Source : Microsoft.Azure.Cosmos.Gremlin.Core
	HResult : 0x80070057

We could not find any reference to this in the docs either.

Thanks for your help!

Daniel Weber · Answer 1 · Wed Feb 17 2021 01:51:56 GMT+0800 (China Standard Time)

IDs in CosmosDb are always strings. That's the closest you can get.

Daniel Weber · Answer 2 · Wed Feb 17 2021 01:55:43 GMT+0800 (China Standard Time)

I transferred the issue to ExRam.Gremlinq for future reference.

Daniel Weber · Answer 3 · Wed Feb 17 2021 02:03:19 GMT+0800 (China Standard Time)

A more extensive opinion on that matter: As IDs in CosmosDB are always strings, even when they look like Guids, don't use Guids on your POCOs to represent them. It'll explode as soon as you encounter a custom ID that does not parse as Guid. Also, on deserialization, you'll lose the original representation of the Guid - hyphens, cases, etc. Even though Gremlin could be forced to work with Guids just fine, don't go there if you have control over the source code of your POCOs. Just use strings.

BEN ABT · Answer 4 · Wed Feb 17 2021 02:03:33 GMT+0800 (China Standard Time)

Technically you are right that an Id is declared as a string in CosmosDB. By default, however, CosmosDB uses guids as content rep.
Nevertheless, the typed representation is actually the safer choice, regardless of the content implementation.

My wish would actually be a generic way of specifying the type.

// Default vertex
public class Vertex :  IVertex
{
    public object? Id { get; set; }
    public string? Label { get; set; }
    public string PartitionKey { get; set; } = "PartitionKey";
}

// Typed vertex Id
public class Vertex :  IVertex
{
    // we know every Id is technically a guid; so we could parse the content (strict);
    //  works on all reads, but fails on all writes like V.Add();
    public Guid? Id { get; set; } 

    public string? Label { get; set; }
    public string PartitionKey { get; set; } = "PartitionKey";
}

// Generic vertex Id
public class Vertex :  IVertex<object>
{
    public string? Label { get; set; }
    public string PartitionKey { get; set; } = "PartitionKey";
}
// aka
public class Vertex :  IVertex<Guid>
{
    public string? Label { get; set; }
    public string PartitionKey { get; set; } = "PartitionKey";
}

public interface IVertex<TVertexId>
{
    public TVertexId? Id { get; set; } // soft
}

The object type currently forces us to use either an unsafe, error-prone signature or very complex abstractions.

I'm absolutely with you on saving and handling the API through/with CosmosDB and Gremlin.
However, I am focusing on a type-safe implementation of the API and the models.

Daniel Weber · Answer 5 · Wed Feb 17 2021 02:07:20 GMT+0800 (China Standard Time)

Why won't strings work for you?

BEN ABT · Answer 6 · Wed Feb 17 2021 02:11:02 GMT+0800 (China Standard Time)

Strings work. That was not my concern.

My intention was more about typing: the wish is that the corresponding method signatures can work with typed values and not only with strings and objects. In addition, there is the mapping of vertex objects to projections and other models (DTOs or whatever).
Using Object and String everywhere is simply error-prone - no more, no less :-)

Daniel Weber · Answer 7 · Wed Feb 17 2021 02:12:56 GMT+0800 (China Standard Time)

Not sure why you don't consider a string a "typed value".

Also, I'm not sure I understand what you mean by "CosmosDB uses guids as content rep.". Assigning a custom Id to a vertex that just says "myCustomId" is perfectly fine in CosmosDB and it's not a Guid. Technically, using a Guid in your POCOs is dangerous.

What's error prone about a string? Sorry, I don't get the rationale.

Daniel Weber · Answer 8 · Wed Feb 17 2021 02:22:17 GMT+0800 (China Standard Time)

You can still force your method signatures to use Guids, if Guids are central to the domain you're programming for, and do the appropriate conversions. But when dealing with Ids on CosmosDB, strings are the only right representation of IDs. I could not encourage you to keep using Guids because it'll blow up sooner or later.
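A minimal sketch of such a conversion at the boundary, assuming the POCO keeps its Id as a string. The `Person` class and the `TryGetGuidId` helper are hypothetical names for illustration and are not part of Gremlinq; only `Guid.TryParse` from the BCL is used.

```csharp
#nullable enable
using System;

// Hypothetical POCO: the Id stays a string, exactly as CosmosDB stores it.
public class Person
{
    public string? Id { get; set; }
    public string? Label { get; set; }
}

public static class PersonIdConversions
{
    // Converts only on demand. Returns false for a missing Id or for a
    // custom Id such as "myCustomId" instead of throwing.
    public static bool TryGetGuidId(this Person person, out Guid id)
        => Guid.TryParse(person.Id, out id);
}
```

Note that the round trip is lossy: an Id stored in upper case comes back in lower case from `Guid.ToString()`, which is one reason to keep the original string on the POCO.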

BEN ABT · Answer 9 · Wed Feb 17 2021 02:32:55 GMT+0800 (China Standard Time)

Thanks for all your answers so far!

I'm still not concerned with the Guid per se, but with the handling of the API and its signatures.
I am not interested in the content of the string, but how to address / use methods.
I don't care what the Id represents in terms of content. The Guid is simply the default behavior of CosmosDB.

Imagine a signature like
Task MyMethod(object personId, object cityId, object carId)

Since the base type object is used everywhere, you have to pay close attention to the correct order, and also make sure that you don't accidentally pass a completely wrong value or use a different value source.

The signature with strings is a bit better, because then at least you can pass only strings. But of course there remains the factor that the Id can be anything.
Task MyMethod(string personId, string cityId, string carId)

We can actually only achieve a true type-safe signature with physically separated types.
Task MyMethod(PersonId personId, CityId cityId, CarId carId)

A corresponding possible implementation of the Id could look like this:

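    public readonly struct PersonId
    {
        public string Value { get; }
        public PersonId(string value) => Value = value;

        // Assumed minimal Parse; the original snippet does not show it.
        public static PersonId Parse(string id) => new PersonId(id);

        public static implicit operator PersonId(string id) => Parse(id);
        public static implicit operator string(PersonId id) => id.Value;
    }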

This may look like overhead at first, but at runtime it costs little or nothing, and it makes handling models across the application easier. The advantage is that while all content values are still strings, the separate structs give us higher type safety for both methods and models.

This works fine for reading vertices (Newtonsoft.Json recognizes the operators and deserializes perfectly), but for writing vertices the API is not able to process the struct properly, because (I guess) it sees the struct itself and not the value.

Daniel Weber · Answer 10 · Wed Feb 17 2021 02:39:21 GMT+0800 (China Standard Time)

There's always the possibility to use a custom serializer, like it has been done for the CosmosDbKey-struct (a combination of id and partitionKey for use with g.V(...)). You will have to define these for every custom Id type though. For deserialization, I guess Newtonsoft.Json handles proper conversion just magically.

Daniel Weber · Answer 11 · Wed Feb 17 2021 02:43:43 GMT+0800 (China Standard Time)

BTW the IVertex interface does not have to be used. Gremlinq is just fine without. Of course, IVertex defines the object Id in the first place but it's not necessary.

BEN ABT · Answer 12 · Wed Feb 17 2021 02:45:56 GMT+0800 (China Standard Time)

Thanks for the hints. That's what we were looking for!

Daniel Weber · Answer 13 · Wed Feb 17 2021 03:36:36 GMT+0800 (China Standard Time)

May this be closed then?

BEN ABT · Answer 14 · Wed Feb 17 2021 03:41:06 GMT+0800 (China Standard Time)

Yes, we can close this for now.

Looks like we have to create a bunch of extension methods to get this working (because the implementation of V() does not work with structs) but I guess we will reach our goal of type safety! :-)

Thanks!

Daniel Weber · Answer 15 · Wed Feb 17 2021 03:46:49 GMT+0800 (China Standard Time)

V works fine with structs, as this test shows and CosmosDbKey is a struct. There'll be boxing involved of course.

BEN ABT · Answer 16 · Wed Feb 17 2021 03:54:28 GMT+0800 (China Standard Time)

We can now serialize / deserialize our struct with the vertices, but every V() call responds with

Value of variable _b is not a constant type. Cannot assign complex values to groovy variable. (Parameter 'value')

So if a struct is supposed to work, either the serializer does not like our struct or we have overlooked something.

Daniel Weber · Answer 17 · Wed Feb 17 2021 03:57:06 GMT+0800 (China Standard Time)

You'll have to override serialization as shown here and of course ultimately serialize it to a string. The error message you get is not from Gremlinq.

BEN ABT · Answer 18 · Wed Feb 17 2021 17:18:55 GMT+0800 (China Standard Time)

Yes, it comes from Gremlin.NET

Gremlin.Net.Driver.Exceptions.ResponseException: InvalidRequestArguments

but it must be related to this. It works with object but throws this exception with our struct.

Our registration is based on this.

public static IGremlinQueryEnvironment RegisterGraphModelSerialization(this IGremlinQueryEnvironment e)
{
    e.ConfigureSerializer(s =>
        s.ConfigureFragmentSerializer(fs =>
            fs.Override<PersonId>((fragment, environment, overridden, recurse)
                => recurse.Serialize(fragment.Value, environment))));

    return e;
}

We also tried to attach .ToGroovy() and we also tried to return (object)fragment.Value;

// Register Options
.ConfigureOptions(options => options.SetValue(WebSocketGremlinqOptions.QueryLogLogLevel, LogLevel.None))
// Register Custom Stuff
.RegisterGraphModelSerialization()
// Register Database
.UseCosmosDb(builder => builder

I guess we'll have to invest more time to dig through the source code, though.

Daniel Weber · Answer 19 · Wed Feb 17 2021 17:26:54 GMT+0800 (China Standard Time)

Immutability is king. Everything is immutable. If you call something on e and return e, you've done nothing. Return the result of Configure.XYZ.
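A self-contained sketch of that pitfall, using a hypothetical `QueryEnvironment` stand-in rather than the real Gremlinq types (all names below are assumptions for illustration). In the registration method above, the equivalent fix would presumably be to return the result of `e.ConfigureSerializer(...)` instead of `e`.

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical immutable environment; NOT the Gremlinq type.
public sealed class QueryEnvironment
{
    public static readonly QueryEnvironment Empty =
        new QueryEnvironment(ImmutableList<string>.Empty);

    public ImmutableList<string> Serializers { get; }

    private QueryEnvironment(ImmutableList<string> serializers)
        => Serializers = serializers;

    // Returns a NEW environment; the receiver is left unchanged.
    public QueryEnvironment ConfigureSerializer(string name)
        => new QueryEnvironment(Serializers.Add(name));
}

public static class QueryEnvironmentExtensions
{
    // Broken: the configured copy is discarded and the untouched input is returned.
    public static QueryEnvironment RegisterBroken(this QueryEnvironment e)
    {
        e.ConfigureSerializer("PersonId");
        return e;
    }

    // Fixed: return the result of the Configure call.
    public static QueryEnvironment RegisterFixed(this QueryEnvironment e)
        => e.ConfigureSerializer("PersonId");
}
```

With this sketch, `RegisterBroken` yields an environment with no registered serializers, while `RegisterFixed` yields one that contains the `PersonId` entry.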

BEN ABT · Answer 20 · Wed Feb 17 2021 17:28:34 GMT+0800 (China Standard Time)

Oh. embarrassing mistake.....

BEN ABT · Answer 21 · Thu Feb 18 2021 22:35:44 GMT+0800 (China Standard Time)

Serialization and deserialization work now. But I guess there is a bug in the LINQ implementation.

If we use Where(v => v.Id == personId), the following passage passes null into our implicit operator, whereas it passes the value if we use string or object.

https://github.com/ExRam/ExRam.Gremlinq/blob/89f6ba298db53ce0d9a28a73280a529e730d74e4/src/ExRam.Gremlinq.Core/Serialization/GremlinQueryFragmentSerializer.cs#L217

When we have more time to investigate the bug, we will create an issue.
For now, the workaround is to use V(personId) instead of Where, as this works.