On Names

Question

mhhf opened this issue 8 years ago · comments

Despite the silent agreement at the start of this discussion that we will not try to solve naming and registries, there is a lot of talk around their necessity. I think this is due to a misconception about the notion of names and a mixing up of different levels of abstraction (manifest, routing, abbreviating/renaming).

The way I see it is that we should separate registries from the manifest design. A manifest should be standalone and work without registries!

I agree with @AFDudley that a "name" for a manifest should be globally unique to produce a clean and deterministic dependency graph. This task is achievable by understanding a manifest name as its checksum, as is already done in IPFS. This yields the desired property of global uniqueness while maintaining the separation from "registries". Also, we don't have to stick with IPFS/Swarm's checksum implementation - any checksum would do the job.

I'd also agree with @tcoulter's assumption that manifests are for non-human actors:

Since our off chain actors are primarily software, the message format needs to be easily readable by most programming languages.

Therefore manifest "names" don't need to be human readable.

With this approach, a manifest which points to dependency "names" points to the checksums of the dependency manifests, which satisfies the task of creating a deterministic and clean dependency graph.

Now the problem becomes how to find a package manifest given its checksum (routing). If we settle on IPFS/Swarm this task is trivial, since data is content-addressed through its checksum. Otherwise a routing layer can be engineered on top which points from a checksum to a list of URIs.

Now registries are another layer on top for human actors to easily search, find and talk about package manifests without having to care about their "real names". They introduce relative aliases/abbreviations for the globally unique manifest names.

Example time!:

Let A,B be packages and A depend on B:

B:

{
  ...
  description: "This is package B!",
  ...
}

The name of B is:
sha3(B) = QmWAfT5xDqaEDxtg9S2yENNExbGmidfbaW8RYKuJ32KHd6

A:

{
  ...
  description: "This is package A which depends on B",
  dependencies: [
    "QmWAfT5xDqaEDxtg9S2yENNExbGmidfbaW8RYKuJ32KHd6"
  ],
  ...
}

The name of A is:
sha3(A) = QmSGmuTVMzNuqj3hRUUQTuwgcxoPtXDmCzhAEa2sdVi18H
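
The naming step above can be sketched in a few lines. This is only an illustration, not a proposed spec: the canonical JSON serialization and the hex SHA3-256 digest are assumptions (the `Qm...` values in the example are IPFS-style identifiers, which this sketch does not reproduce), and `manifest_id` is a hypothetical helper name.

```python
import hashlib
import json


def manifest_id(manifest: dict) -> str:
    """Return a content-derived identifier for a manifest (hex SHA3-256).

    Assumption: the manifest is serialized as canonical JSON (sorted keys,
    no extra whitespace) so that equal content always hashes equally.
    """
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha3_256(canonical.encode("utf-8")).hexdigest()


# Package B has no dependencies, so its identifier depends only on its content.
package_b = {"description": "This is package B!"}
id_b = manifest_id(package_b)

# Package A refers to B by identifier, never by a human-readable name.
package_a = {
    "description": "This is package A which depends on B",
    "dependencies": [id_b],
}
id_a = manifest_id(package_a)
```

Because A's identifier covers B's identifier, any change to B yields a new identifier for B and, once A is updated to point at it, a new identifier for A. That is what keeps the dependency graph deterministic.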

Now a routing layer can be built:

{
  "QmWAfT5xDqaEDxtg9S2yENNExbGmidfbaW8RYKuJ32KHd6": [
    "https://github.com/awesomecorp/B",
    "http://awesome.co/resources/B"
  ],
  "QmSGmuTVMzNuqj3hRUUQTuwgcxoPtXDmCzhAEa2sdVi18H": [
    "https://github.com/littlebilly/A"
  ]
}
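
A consumer of such a routing table does not need to trust any of the listed URIs, because whatever it downloads can be checked against the identifier. A minimal sketch, assuming hex SHA3-256 identifiers and a caller-supplied `fetch` function (both hypothetical; the in-memory `fake_web` dictionary stands in for real HTTP requests):

```python
import hashlib
from typing import Callable, Dict, List, Optional


def checksum(data: bytes) -> str:
    """Hex SHA3-256 of the raw manifest bytes (an assumed choice of hash)."""
    return hashlib.sha3_256(data).hexdigest()


def fetch_verified(
    identifier: str,
    routes: Dict[str, List[str]],
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each URI listed for the identifier; return the first matching payload."""
    for uri in routes.get(identifier, []):
        try:
            data = fetch(uri)
        except Exception:
            continue  # unreachable source, try the next one
        if checksum(data) == identifier:
            return data  # content matches its name
    return None  # no source delivered the expected content


manifest_b = b'{"description":"This is package B!"}'
id_b = checksum(manifest_b)
routes = {
    id_b: ["https://github.com/awesomecorp/B", "http://awesome.co/resources/B"],
}
# Stand-in for the network: the first source serves tampered content.
fake_web = {
    "https://github.com/awesomecorp/B": b"tampered",
    "http://awesome.co/resources/B": manifest_b,
}
result = fetch_verified(id_b, routes, fake_web.__getitem__)
```

Here the first URI is rejected because its content does not hash to the identifier, and the second one is accepted.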

And a name registry (mhhf's registry):

{
  "A": "QmSGmuTVMzNuqj3hRUUQTuwgcxoPtXDmCzhAEa2sdVi18H",
  "B": "QmWAfT5xDqaEDxtg9S2yENNExbGmidfbaW8RYKuJ32KHd6"
}
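
A tool sitting between the user and the manifest could use such a registry to translate aliases into identifiers before anything is written to a manifest. The sketch below is an assumption about how that might look; `pin` is a hypothetical helper, and the registry is the plain mapping from the example.

```python
from typing import Dict, List

registry: Dict[str, str] = {
    "A": "QmSGmuTVMzNuqj3hRUUQTuwgcxoPtXDmCzhAEa2sdVi18H",
    "B": "QmWAfT5xDqaEDxtg9S2yENNExbGmidfbaW8RYKuJ32KHd6",
}


def pin(names: List[str], registry: Dict[str, str]) -> List[str]:
    """Translate human-facing aliases into identifiers, failing on unknown names."""
    missing = [name for name in names if name not in registry]
    if missing:
        raise KeyError(f"unknown package names: {missing}")
    return [registry[name] for name in names]


# The user types "B"; the manifest only ever stores the identifier.
dependencies = pin(["B"], registry)
```

A different registry may map "B" to something else entirely, which is harmless as long as manifests store only the identifiers.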

RJ Catalano · Answer 1 · Wed Oct 12 2016 03:51:10 GMT+0800 (China Standard Time)

what if we also want to include traditional DNS? Say I want to link to github...how would we do that?

Denis Erfurt · Answer 2 · Wed Oct 12 2016 03:55:26 GMT+0800 (China Standard Time)

@VoR0220 what do you mean by link to github? This is done in the routing layer as shown above. Whether the github link indeed has the right manifest can be easily verified by checking its checksum.

RJ Catalano · Answer 3 · Wed Oct 12 2016 03:57:19 GMT+0800 (China Standard Time)

hrmmmm....

RJ Catalano · Answer 4 · Wed Oct 12 2016 04:02:32 GMT+0800 (China Standard Time)

so basically this is a simple mapping of sha3'd package source code and its dependencies to an array of string urls, correct?

Tim Coulter · Answer 5 · Wed Oct 12 2016 04:05:53 GMT+0800 (China Standard Time)

I agree with @AFDudley that a "name" for a manifest should be globally unique to produce a clean and deterministic dependency graph.

You can't make a name globally unique. You can easily deploy a different registry on the same chain and publish a package of the same name. Your model implies compliance to a central registry which is not likely to be followed in practice.

Edit: If I could take this comment back I would. I responded quickly without fully understanding. Apologies.

Denis Erfurt · Answer 6 · Wed Oct 12 2016 04:07:15 GMT+0800 (China Standard Time)

@VoR0220 the sha3 represents the hash of the manifest, which includes the sha3's of the contract code or other data, making a package fully reconstructable and verifiable. The point of my argument is that with this, we don't need to worry about the routing on the same level as we worry about the manifest design.

My example routing is just a hashmap mapping manifest names to a URI (in this case a github URL) where the user can find the data (in this case the manifest).

Denis Erfurt · Answer 7 · Wed Oct 12 2016 04:09:20 GMT+0800 (China Standard Time)

You can't make a name globally unique. You can easily deploy a different registry on the same chain and publish a package of the same name. Your model implies compliance to a central registry which is not likely to be followed in practice.

@tcoulter A name is a hash of its content, which is globally unique without ever touching registries. We just need to form a consensus on which hash function we actually want to use, that's all.
Global uniqueness guaranteed! (unless you can break hashing)

Tim Coulter · Answer 8 · Wed Oct 12 2016 04:12:39 GMT+0800 (China Standard Time)

I think we mean different things when we say "name". If we're talking about a hash of content, I'd like to call that an identifier, or just "hash". I don't see users using hashes to reference packages, as they would names (the package manager might, however). There needs to be a step to resolve names to identifiers/hashes.

A. F. Dudley · Answer 9 · Wed Oct 12 2016 04:13:22 GMT+0800 (China Standard Time)

As Jan pointed out to me earlier, it appears that there is confusion regarding what we mean by manifest. We think the manifest should be human readable and included in the directory that contains the package source. There is another file that needs to be on the blockchain which is used for actually linking the contract on a given chain. We are able to provide an example of the human readable file in the next couple of days. This is the file I'm concerned with.

Tim, I'm not a fucking idiot, please don't attribute idiotic centralization to me. My model requires nothing of the sort. Thanks.

Tim Coulter · Answer 10 · Wed Oct 12 2016 04:15:06 GMT+0800 (China Standard Time)

Tim, I'm not a fucking idiot, please don't attribute idiotic centralization to me. My model requires nothing of the sort. Thanks.

If you're going to talk like that please leave. You're not a victim, and I wasn't attacking you. We all are human, we get confused, and I have the best intentions. I don't want you here if you're going to turn the discussion that direction.

Tim Coulter · Answer 11 · Wed Oct 12 2016 04:15:39 GMT+0800 (China Standard Time)

@mhhf What I was going to say was that I got to that line and immediately commented. I should have read the rest.

Denis Erfurt · Answer 12 · Wed Oct 12 2016 04:16:10 GMT+0800 (China Standard Time)

@tcoulter This is my whole point - we mistake names for identifiers!
Manifests should never even touch names - they should solely work with identifiers.

a registry now maps names to identifiers.

no worries

Tim Coulter · Answer 13 · Wed Oct 12 2016 04:19:54 GMT+0800 (China Standard Time)

manifests should never even touch names - they should solely work with identifiers.

👍

Denis Erfurt · Answer 14 · Wed Oct 12 2016 04:20:06 GMT+0800 (China Standard Time)

We think the manifest should be human readable and included in the directory that contains the package source. There is another file that needs to be on the blockchain which is used for actually linking the contract on a given chain.

@AFDudley What would you call "the other file"? The whole time I was referring to "the other file" as the manifest.
Also: do we even need to agree on the human readable file in the directory? I think this should be handled by every dev-tool individually.

Tim Coulter · Answer 15 · Wed Oct 12 2016 04:25:16 GMT+0800 (China Standard Time)

Also: do we even need to agree on the human readable file in the directory? I think this should be handled by every dev-tool individually.

I don't think we do, but it might be nice for frameworks written in the same language (catering to the same developer audience) to share a common format.

A. F. Dudley · Answer 16 · Wed Oct 12 2016 04:41:02 GMT+0800 (China Standard Time)

Lock file/link file make sense to me. Since it contains all the linking between the contracts. Ultimately, both files will be used by the developers during the full course of development and the format of one file will have an impact on the other since they are mappings.

Tim, I'm sorry frankly addressing your insult, makes you the victim. In the future if my designs don't make sense to you, consider asking for clarification instead of assuming I misunderstand how blockchains work or why people use them.

Denis Erfurt · Answer 17 · Wed Oct 12 2016 04:45:12 GMT+0800 (China Standard Time)

Ultimately, both files will be used by the developers during the full course of development and the format of one file will have an impact on the other since they are mappings.

I'd disagree. I think the dev-tool framework is essentially an abstraction layer over this lock file, providing the user with a richer display as well as more dev-tool-specific data, like front-end-related data. In that sense a manifest file should be a superset of the lock file. However, we have to clear up the terminology here, since the distinction between lock and manifest files matters.

Piper Merriam · Answer 18 · Wed Oct 12 2016 04:48:45 GMT+0800 (China Standard Time)

Tim, I'm sorry frankly addressing your insult, makes you the victim. In the future if my designs don't make sense to you, consider asking for clarification instead of assuming I misunderstand how blockchains work or why people use them.

@AFDudley please try to reframe how you are interpreting this stuff because nobody is attacking you.

Tim Coulter · Answer 19 · Wed Oct 12 2016 04:58:29 GMT+0800 (China Standard Time)

@AFDudley I'm very happy to discuss this personally with you, as we don't need to clutter up this ticket. That said, in any future event you think I'm attacking you personally, please bring it up to me directly and let's talk about it. I'm sure 100% of the time it'll be a misunderstanding.

A. F. Dudley · Answer 20 · Thu Oct 13 2016 00:14:44 GMT+0800 (China Standard Time)

I'd disagree. I think the dev-tool framework is essentially an abstraction layer over this lock file, providing the user with a richer display as well as more dev-tool-specific data, like front-end-related data. In that sense a manifest file should be a superset of the lock file. However, we have to clear up the terminology here, since the distinction between lock and manifest files matters.

"used by" was vague, sorry.

Yes, tools will generate lock files, and users shouldn't be reading lock files. What I meant was, just like users need to be aware of lock files in other environments because this allows them to debug linking errors and name resolution errors, they will need to be aware of lock files in this environment.

I think this should be handled by every dev-tool individually.

It's a pipeline of tools, everyone here has a tool, and they all have unique value propositions. I see there being three layers that PMs need to concern themselves with:

1. Smart contract (Solidity) source code (from github, for example)
2. Blockchain bytecode (deployed to the Ethereum mainnet, for example)
3. Client-side code that consumes ABIs (in-browser JavaScript, or Python-based oracles, for example)

It would be nice if developers could go to a website, maybe in Mist, maybe with an address like 'repos.eth', and search through a collection of repos for 1, which a specified package manager would then use to provide 2. Presumably, the developer would need another tool in the pipeline, which prepares 1 and 2 such that they are consumable by 3. (We should name this process.)

Tool A takes a human readable manifest file from a source code repo and returns a local file which points to the lock file on chain. What does the lock file on-chain contain?

Tool B takes a lock-file on-chain, or a file pointing to a lockfile on-chain, and generates the glue for a given language so that it can interact as a client to the smart-contract from 2. (This requires 1 or ABIs and maybe some other artifacts)

What are the names for Tool A and Tool B?
How does a spec for the lockfile help unless it also specifies the location of the ABIs and a hash to the signed source code commit associated with it? (as mentioned by Peter Vessenes in the MakerDAO audit 2)
Why would the lock file be anything but a binary format?

A. F. Dudley · Answer 21 · Thu Oct 13 2016 00:48:09 GMT+0800 (China Standard Time)

As for 'global' uniqueness:

The model (since there seems to be some confusion):
A single blockchain contains multiple registry contracts, each of which contains multiple repos, each of which contains multiple packages, each of which contains multiple smart contract source files and a manifest (at minimum; EVM intrinsic tests would be nice too).

The registry is just ENS (which, if I understand correctly, approximately provides {"repos.eth": 0xd3adb33f}). This asserts that ENS names will be unique. Package managers handle ENS name resolution.

The repo knows its name, and contains a list of tuples containing:

Unique package name
Owner
URI linking to the human readable manifest
Address of lock file on-chain
(Maybe) Other metadata like last update

Inserts into the repo are rejected on name collision (name/owner mismatch).
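
The insert rule can be sketched as follows. This is an in-memory Python stand-in for what would presumably be an on-chain contract, and it rests on one interpretation of "name/owner mismatch": a name already registered to one owner cannot be claimed by another, while the same owner may update their own entry. All class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PackageEntry:
    name: str               # unique package name within the repo
    owner: str              # e.g. an account address
    manifest_uri: str       # URI linking to the human readable manifest
    lock_file_address: str  # address of the lock file on-chain


class NameCollision(Exception):
    """Raised when a name is already registered to a different owner."""


class Repo:
    def __init__(self, name: str) -> None:
        self.name = name  # the repo knows its own name, e.g. "repos.eth"
        self._entries: Dict[str, PackageEntry] = {}

    def insert(self, entry: PackageEntry) -> None:
        existing = self._entries.get(entry.name)
        if existing is not None and existing.owner != entry.owner:
            raise NameCollision(f"{entry.name!r} is owned by {existing.owner}")
        # Assumption: the same owner may overwrite (update) their own entry.
        self._entries[entry.name] = entry

    def get(self, name: str) -> PackageEntry:
        return self._entries[name]


repo = Repo("repos.eth")
repo.insert(PackageEntry("B", "0xaaa", "https://github.com/awesomecorp/B", "0x01"))
```

Uniqueness here holds only within one repo; two repos under different ENS names can each contain a package called "B" without conflict.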

If the manifest references other packages, tools will assume those packages are in the same repo. Manifests may specify the expected name of the repo. Users can remove this value from manifest at their own peril.

I think that covers it. Let me know if there is some gap in uniqueness here or some point of centralization, I don't see one.

(This is an MVP spec, I see cross-chain manifests as being a requirement in short order.)

Piper Merriam · Answer 22 · Tue Dec 06 2016 02:04:44 GMT+0800 (China Standard Time)

I think this has been resolved. Closing.