w3c / did-test-suite

W3C DID Test Suite and Implementation Report

Home Page:https://w3c.github.io/did-test-suite/

How should a byte stream be encoded into JSON for a dereferenced DID URL that points to an external resource outside the DID Document?

kdenhartog opened this issue · comments

Currently the only examples in the test suite for the executions.output.contentStream property in the dereferencer data model input are stringified JSON. How should we encode a different media type such as image/jpeg into the result?

At present, the test suite isn't designed to do this and probably won't get that functionality by the time CR ends.

If someone needs this functionality, they will have to come up with a PR to support it. @peacekeeper might have done more thinking on this since that's his part of the test suite.

The proposed method I've been working towards is converting the data to a data URI. So for an image/jpeg it will be data:image/jpeg;base64,<base64encodedImage>. This seems like a robust way to represent any byte stream of a resource in JSON. Does anyone see any problems with this approach?
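
A minimal sketch of the data URI proposal above, assuming base64 encoding as in the `data:image/jpeg;base64,...` form. The helper names `to_data_uri` and `from_data_uri` are hypothetical and are not part of the test suite; the sample bytes are only the first bytes of a JPEG header, not a real image.

```python
import base64
from typing import Tuple


def to_data_uri(payload: bytes, media_type: str) -> str:
    """Encode raw bytes as a base64 data URI so they fit in a JSON string."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def from_data_uri(uri: str) -> Tuple[str, bytes]:
    """Recover (media_type, bytes) from a data URI produced by to_data_uri."""
    header, _, encoded = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(encoded)


# Hypothetical payload: the leading bytes of a JPEG file, used only for illustration.
sample = b"\xff\xd8\xff\xe0"
content_stream = to_data_uri(sample, "image/jpeg")
```

The resulting string could then be placed in executions.output.contentStream, since it is an ordinary JSON string regardless of the underlying media type.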

@kdenhartog I am not sure I understand the question.

But, I may have been working on similar solutions.

Are you wondering how to embed arbitrary data in a JSON object?

Or do you want to put arbitrary data in your DID Document, such that said data gets returned when the right DID-URL is dereferenced?

You could put the data in a data: URL and put that data:URL as a service endpoint, but I wouldn't recommend it.

I am confused, in part, because embedding != external resource, so I think I lost what you were trying to get at.

What's the use case you're trying to solve for?

Currently, the inputs and outputs for our test results have to be encoded into a JSON object for dereferencing. As an example did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif should map to https://i.imgur.com/JwIZJYD.gif which results in a stream of bytes. In order to put this data into executions.output.contentStream we need some way to encode the resulting resource that has been dereferenced. So the original question was how should we do that? My proposal is to encode it using a data URI so that we can support any arbitrarily referenced resource.

My proposal is to encode it using a data URI so that we can support any arbitrarily referenced resource.

Yes, this works @kdenhartog -- and given that no one else has proposed something, and that what you're suggesting should work for the purposes of the test suite, we're good.

I'm not sure we need to test this as a part of the test suite, though? Is there a dereferencing test that would exercise this capability?

So, this is something I've been doing a lot of work with recently. My opinion is that data: urls are insufficient for a number of reasons, BUT they can work in many instances. The problem is that data: urls only provide a media-type for content description. They don't provide for alternative encodings, compression, or encryption. See the Learning and Employment Record for an approach that embeds arbitrary content in JSON-LD (https://drive.google.com/file/d/1RfdXAUNhp0kluD9htpb8c_Tg2dLm2QcJ/view).

FWIW, there is also a new property for DID Documents I've been working on called "LinkedResources" which may be a better way to do what you're trying to do. It's supported by DID-core and will be submitted to the DID Spec Registries, but is not ready for submission as we are proving out the concept with multiple implementations before socializing. Ping me separately if you'd like to learn more.

I'm still confused by your use case however.

Taking your example,

did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif

should map to

https://i.imgur.com/JwIZJYD.gif

which results in a stream of bytes. In order to put this data into

executions.output.contentStream

I'm not following this last bit at all. It seems like you are testing a method-specific dereferencing semantic with a parameter that is not well defined (highlighting the tribal wisdom issues already raised).

My interpretation of how one would process your proposed did is this:

  1. Parse DID-URL
    a. DID : did:web:kyledenhartog.com
    b. service : dogPicService
    c. relativeRef : JwIZJYD.gif
  2. Resolve DID Document for did:web:kyledenhartog.com
  3. Find service listing in DID Document with an id of did:web:kyledenhartog.com#dogPicService
  4. That would be something like
service:[{
  "id":"#dogPicService",
  "type":"webWithRelativeRef",
  "serviceEndpoint":"https://i.imgur.com"
}...]

Note, currently there are only two service types listed in the did spec registries, so the value of the "type" property is ambiguous, but that type property should definitively explain how you work with that service endpoint. I just made up a type that I am interpreting as saying this is a normal web link which is intended to work correctly when you append the RelativeRef property.

Continuing the flow:
  5. That service endpoint gets interpreted with the append RelativeRef semantic to return the URL https://i.imgur.com/JwIZJYD.gif
  6. Dereference that URL using your preferred https compatible network library
    a. The result is the stream of bytes representing that gif, in whatever packaging your library provides
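
Steps 1 through 5 of the flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `did_url_to_http_url` and the DID Document `doc` are hypothetical, the "append relativeRef" semantic is implemented as plain path concatenation, and DID Resolution may end up defining this differently. Step 2 (resolving the DID Document) is skipped; the document is passed in directly.

```python
from urllib.parse import parse_qs


def did_url_to_http_url(did_url: str, did_document: dict) -> str:
    """Map a DID URL with service/relativeRef parameters to a plain URL."""
    # Step 1: split the DID URL into the DID and its query parameters.
    without_fragment = did_url.partition("#")[0]
    did, _, query = without_fragment.partition("?")
    params = parse_qs(query)
    service_name = params["service"][0]
    relative_ref = params.get("relativeRef", [""])[0]

    # Step 3: find the service entry; ids may be relative or absolute.
    wanted_ids = (f"#{service_name}", f"{did}#{service_name}")
    for service in did_document.get("service", []):
        if service["id"] in wanted_ids:
            endpoint = service["serviceEndpoint"]
            if not relative_ref:
                return endpoint
            # Step 5: assumed "append" semantic, i.e. simple concatenation.
            return endpoint.rstrip("/") + "/" + relative_ref.lstrip("/")
    raise LookupError(f"no service named {service_name!r} in DID Document")


# Hypothetical DID Document matching the service entry sketched in the thread.
doc = {
    "id": "did:web:kyledenhartog.com",
    "service": [{
        "id": "#dogPicService",
        "type": "webWithRelativeRef",  # made-up type, as noted in the thread
        "serviceEndpoint": "https://i.imgur.com",
    }],
}

url = did_url_to_http_url(
    "did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif",
    doc,
)
```

Step 6 is then an ordinary HTTP GET on `url` with whatever HTTP library is at hand.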

In this flow, I don't understand why there is any confusion about what gets put into executions.output.contentStream, in part because it seems that it MUST be whatever is returned in the payload for that gif, and presumably the header data from that retrieval gets mapped to contentMetadata. And this is entirely the responsibility of the resolver--not a function of DID Core. How and what you put into that dereferencing response is currently method-specific and otherwise pre-standards track with the DID Resolution spec.

Personally, I would not build out an implementation this way. I would simply take the DID Document, transform the DID-URL into a URL with the service property semantics, and then dereference that URL.

In particular,

<img src="did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif">

IMO, should work in any DID enabled browser to actually render that image.

We have so much implemented in web libraries that the current dereferencing semantics don't make much sense to me. What I think we need is for dids to be usable as URLs, full stop. We probably don't want to do that by reinventing dereferencing, we want to do that by defining outputs that standard browser agents can use.

Sure, you can also create bespoke dereferencing (such as pulling something from the chain) that would, by its nature, be method-specific, but this pattern of transforming a DID-URL into a URL is going to be vital for a reasonable transition of current web-centric applications to DID-aware applications. The actual dereferencing in that vast array of use cases should just use standard web dereferencing.

So, let me try the use case question again.

What is the real-world value-creating interaction that you are trying to enable? The IMG tag example above doesn't need the dereferencing API and will likely never use a DID resolver to do the dereferencing.

Further, the dereferencing contract is stated in an abstract form. You're basically asking for a non-abstract solution, which is out of scope for DID-Core, IMO.

On the other hand, I would love to see a test using a data URL for a service, which returns the image via http (although I still feel that is testing the resolver and not did-core):

service:[{
  "id":"#dogPicService",
  "type":"dataUrl",
  "serviceEndpoint":""
}...]

(fwiw, that is the data:url for the gif at https://i.imgur.com/JwIZJYD.gif)

I'm not sure we need to test this as a part of the test suite, though? Is there a dereferencing test that would exercise this capability

My intent was to add one to show the usage of service=dogPicService&relativeRef=JwIZJYD.gif parameters since they were at risk. That's the main use case for why those parameters would be used in my mind. I'm sure there's plenty of other ones though.

What is the real-world value-creating interaction that you are trying to enable?

A DID URL that behaves like a PURL.

The IMG tag example above doesn't need the dereferencing API and will likely never use a DID resolver to do the dereferencing.

Certainly, but it requires a DID enabled browser which we don't currently have.

You're basically asking for a non-abstract solution, which is out of scope for DID-Core, IMO.

Agreed. Only reason I'm having to produce a non-abstract solution is because something needs to go in the executions.output.contentStream and to date we've only had JSON objects. Surely a DID URL can point to something other than a JSON object right?

On the other hand, I would love to see a test using a data URL for a service, which returns the image via http (although I still feel that is testing the resolver and not did-core)

Yeah I do think we're quite close to heading into resolver land so saying this must be done in some normative way is incorrect in my eyes. I'm more just heading in the direction of "What do I put in there when the DID URL references an external resource... oh I'll just put an encoded string in there since it at least fits in a JSON object". I suspect others may head in a different direction which is fine until DID Resolution defines how this should work.

@kdenhartog wrote

Certainly, but it requires a DID enabled browser which we don't currently have.

To be clear, it only needs an HTTP request library, which is available in just about every language on every platform. You don't need a full browser.

While it's true that HTTP libraries are really all that is needed, we don't support DID URLs at the moment either. Hence the need to find a way to transform a DID URL to an HTTP URL when fetching an external resource not contained within the DID Document, so that we can glue the two concepts together.

@kdenhartog few comments..

  1. If all we want to do is remove the at-risk markers for service and relativeRef, then I think we don't need to actually test dereferencing DID URLs that have these parameters. It should be sufficient to submit a test report for the did-identifier test suite, which simply tests the syntax (also see my comment w3c/did-core#708 about "Separating Identification from Interaction").

  2. Having said that, I fully agree with the DIDs-as-PURLs use case, and I'm excited to see it get implemented! This was the reason for introducing these parameters in the first place.

  3. (Side note: Of course with matrix parameters this would have worked even better, since this would have allowed us to do "partial redirection" with DID PURLs - see slides here for an explanation.)

  4. If in the test suite you want to express that the contentStream contains binary data, I suggest we use hex encoding, to be consistent with the example of resolveRepresentation() that also contains e.g. binary CBOR data: https://github.com/w3c/did-test-suite/blob/main/packages/did-core-test-server/suites/implementations/resolver-example-didwg.json#L131. What do you think about that?

  5. In your example DID URL did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif, I would consider the option that the dereference() process may actually NOT return the binary GIF directly - instead it could return something like the following:

{
    "dereferencingMetadata": {
      "resourceUrl": "https://i.imgur.com/JwIZJYD.gif"
    },
    "contentStream": null,
    "contentMetadata": {}
}

Retrieving the binary GIF data would then be a second (purely HTTP based) dereferencing process. This may be more consistent with how today's HTTP-based PURLs work, which redirect you to a resource and trigger a second dereferencing process, rather than returning the resource itself in the first step. But these details are really out of scope for DID Core and we should collaborate on DID Resolution to specify this.
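
The two-step process described in item 5 could look roughly like this on the client side. It is a sketch only: `resourceUrl` is the property name from the example above and is not defined by DID Core, `fetch_redirected_resource` is a hypothetical helper, and the HTTP fetch is injected as a function so that a stub can stand in for a real HTTP library.

```python
from typing import Callable


def fetch_redirected_resource(result: dict,
                              http_get: Callable[[str], bytes]) -> bytes:
    """Second step: follow the URL that the first dereferencing step returned."""
    url = result.get("dereferencingMetadata", {}).get("resourceUrl")
    if url is None:
        raise ValueError("dereferencing result carries no resourceUrl to follow")
    # Purely HTTP-based dereferencing; no DID logic is involved from here on.
    return http_get(url)


# First step (assumed output of dereference(), copied from the example above).
first_step = {
    "dereferencingMetadata": {"resourceUrl": "https://i.imgur.com/JwIZJYD.gif"},
    "contentStream": None,
    "contentMetadata": {},
}

# Stub standing in for a real HTTP client; the bytes are placeholder content.
fake_web = {"https://i.imgur.com/JwIZJYD.gif": b"GIF89a"}
gif_bytes = fetch_redirected_resource(first_step, lambda u: fake_web[u])
```

In a real client, `http_get` would be replaced by a call into an actual HTTP library.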

If in the test suite you want to express that the contentStream contains binary data, I suggest we use hex encoding, to be consistent with the example of resolveRepresentation() that also contains e.g. binary CBOR data: https://github.com/w3c/did-test-suite/blob/main/packages/did-core-test-server/suites/implementations/resolver-example-didwg.json#L131. What do you think about that?

Was happy to convert to hex, but prefer your proposal in number 5 instead. Will go with that.

In your example DID URL did:web:kyledenhartog.com?service=dogPicService&relativeRef=JwIZJYD.gif, I would consider the option that the dereference() process may actually NOT return the binary GIF directly - instead it could return something like the following

I actually way prefer this methodology. Will go in this direction instead and implement the test to behave in that way as well.

Retrieving the binary GIF data would then be a second (purely HTTP based) dereferencing process. This may be more consistent with how today's HTTP-based PURLs work, which redirect you to a resource and trigger a second dereferencing process, rather than returning the resource itself in the first step. But these details are really out of scope for DID Core and we should collaborate on DID Resolution to specify this.

+1 agreed

@peacekeeper in order to comply with a few of the already written tests I had to modify the output to be:

{
    "contentStream": "https://i.imgur.com/KW6NCtG.jpg",
    "dereferencingMetadata": {
        "contentType": "text/url"
    },
    "contentMetadata": {}
}

Does that work for you? My implementation is passing all the tests now with this change.

@kdenhartog yes I like this! You're right, my earlier example was incorrect, since the dereferencing part of the spec says that contentStream MUST contain a resource.
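
For illustration, a dereferencer could assemble the agreed-upon output along these lines. The function name `url_dereferencing_result` is hypothetical, and `text/url` is simply the contentType value used in this thread; nothing here establishes it as a registered media type.

```python
import json


def url_dereferencing_result(resource_url: str) -> dict:
    """Build a result whose contentStream is the external resource's URL."""
    if not resource_url:
        # The thread notes that contentStream MUST contain a resource.
        raise ValueError("contentStream must not be empty")
    return {
        "contentStream": resource_url,
        "dereferencingMetadata": {"contentType": "text/url"},
        "contentMetadata": {},
    }


result = url_dereferencing_result("https://i.imgur.com/KW6NCtG.jpg")
serialized = json.dumps(result, indent=4)
```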