toml-lang / toml

Tom's Obvious, Minimal Language

Home Page: https://toml.io

Feature request: table alias

mcrumiller opened this issue · comments

I commonly have long-ish table names that represent, say, a database table, and I define fields within:

[long_table_name_that_starts_with_a]
[long_table_name_that_starts_with_a.field1]
null = true
options = ["a", "b", "c"]

[long_table_name_that_starts_with_a.field2]
null = false
options = [1, 2, 3]

This is fairly unreadable, and it would be nice to alias tables, perhaps with the following syntax or something similar:

[long_table_name_that_starts_with_a]: a
[a.field1]
null = true
options = ["a", "b", "c"]

[a.field2]
null = false
options = [1, 2, 3]

I like this idea, but it raises some questions.

  • This suggests that aliases exist only within the TOML document, not in the configuration produced by a parser. For instance, your example would have no table named "a" in its config, just a table named "long_table_name_that_starts_with_a" and its two subtables field1 and field2. Is this the case? Or would there be some method accompanying the configuration that would link directly to the aliased tables?
  • How would an alias work inside of an array of tables? Would [[array-of-tables]]: aot make the alias aot refer to the array, or just to this particular element? Or could it be used for all subsequent elements of the array?
  • Likewise, if an alias were to be defined for a subtable of an element of an array of tables, would it exist for the span of that element? Or would its use suggest that subsequent elements would share the same structure as soon as the alias is referenced in a new element table? This approach is way too complicated, so I may insist that aliases could not be used inside element tables.
  • Could we use a different syntax to indicate the alias, so we can quickly tell it apart from a regular table name? Something like [@a = long_table_name_that_starts_with_a], which would have subtables [@a.field1] and [@a.field2]? I would suggest doing something like this, because we have always assumed that table names are absolute references, and aliases shouldn't change that.
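To make the first and last questions concrete, here is a minimal Python sketch of one possible reading: aliases exist only in the document text and are expanded to absolute table names before parsing. The `[@alias = name]` syntax is the hypothetical one suggested above, and `expand_named_aliases` is an illustrative name, not part of TOML or any parser. The sketch is line-based, ignores trailing comments on header lines, and deliberately leaves the array-of-tables questions unanswered.

```python
import re

# Hypothetical alias definition: [@alias = real.table.name]
ALIAS_DEF = re.compile(r"^\[@([A-Za-z0-9_-]+)\s*=\s*([^\]]+?)\s*\]\s*$")
# Hypothetical alias use: [@alias] or [@alias.sub.table]
ALIAS_USE = re.compile(r"^\[@([A-Za-z0-9_-]+)((?:\.[^\]]+)?)\]\s*$")


def expand_named_aliases(text: str) -> str:
    """Rewrite alias headers into ordinary absolute TOML table headers."""
    aliases = {}  # alias name -> absolute table name
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        m = ALIAS_DEF.match(stripped)
        if m:
            name, target = m.groups()
            aliases[name] = target
            out.append(f"[{target}]")  # the definition also opens the table
            continue
        m = ALIAS_USE.match(stripped)
        if m:
            name, rest = m.groups()
            if name not in aliases:
                raise ValueError(f"unknown alias: @{name}")
            out.append(f"[{aliases[name]}{rest}]")
            continue
        out.append(line)  # key/value pairs, comments, blank lines
    return "\n".join(out)


doc = """[@a = long_table_name_that_starts_with_a]
[@a.field1]
null = true

[@a.field2]
null = false"""
print(expand_named_aliases(doc))
```

Under this reading the expanded document contains no table named "a" at all, so a parser would only ever see the long absolute names.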

I'm not quite convinced. In this case, where the alias is just an alternative to the long name, why not use it instead of the long name in the first place?

@ChristianSi even though I made this request, I'm leaning towards your point of view as well now. One could work around the issue without complicating the language with the following:

[a]
name = "long_table_name_that_starts_with_a"

[a.field1]
null = true
options = ["a", "b", "c"]

[a.field2]
null = false
options = [1, 2, 3]

I'll leave this open for one more day and, without more support, I'll close it.

Thank you for posing this question. I'd like to revisit it someday, but I can open a new issue if necessary.

I could see this being useful if you have an elaborate project config. In the Python world, it's possible to configure certain development tools within pyproject.toml, though some prefer not to. Nevertheless, imagine the following simple example where a well-chosen alias can prove useful and add context to a config without adding comments.

[@ruff = tool.ruff]
extend-include = ["*.ipynb"]
line-length = 100
python-version = "3.8"

[@ruff.lint]
extend-select = ["B", "F", "I"]

I would prefer a way to refer to the parent table, for example with &:

[long_table_name_that_starts_with_a]
[&.field1]
null = true
options = ["a", "b", "c"]

[&.field2]
null = false
options = [1, 2, 3]

This has been discussed before though, and was rejected. But that's a lot easier to use, because you no longer need to map table aliases to the actual names. I dislike aliases in things like SQL. It's probably also easier to implement in most cases.

I think we need to wait until after 1.1 and newlines in inline tables, because what I suspect is that a lot of people will write it like:

[long_table_name_that_starts_with_a]
field1 = {
    null    = true,
    options = ["a", "b", "c"],
}

field2 = {
    null    = false,
    options = [1, 2, 3],
}

Or at least, that's how I would write it.

So that would also solve the problem, at least for this example.

@arp242 Wow, you're describing a localized alias system that's suddenly brilliant in its simplicity! I was against the [&] approach when it was mentioned a long time ago, but now it makes perfect sense. Let me wrap my head around this if I may bloviate a bit...

We don't need to name aliases for this to work! We start with a table section with key/value pairs defined the usual way, e.g. [long-table-name], then we introduce a new section with [&.subtable1]. That turns & into an alias, much like how @mcrumiller defined it, but with no need for a special name. This alias is absolute, too, so that [&.subtable2] would refer to another subtable of long-table-name on the same level as long-table-name.subtable1.

For as long as each table name begins with an &, the alias will remain the same. And once a table is defined without the leading &, e.g. [another-table], the old alias is gone and & refers to that new table instead. And this would work in the same way with arrays of tables, e.g. [[&.aot-inside-the-other-table]].

Super quick rehash: With the examples above, we would immediately know that:

  • long-table-name is a table,
  • long-table-name.subtable1 is a subtable,
  • long-table-name.subtable2 is also a subtable,
  • another-table is a table on the same level as long-table-name,
  • and another-table.aot-inside-the-other-table is an array of tables in another-table.
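The rule described above can be sketched as a small preprocessor in Python. This is only an illustration under stated assumptions: `&` stands for the most recent header written without a leading `&`, the function name `expand_parent_refs` is made up for this example, and the scan is line-based, so it does not understand multi-line arrays or strings that happen to contain bracketed lines.

```python
import re

# Matches [name] and [[name]] headers: opening brackets, name, closing brackets.
HEADER = re.compile(r"^(\[\[?)\s*([^\[\]]+?)\s*(\]\]?)\s*$")


def expand_parent_refs(text: str) -> str:
    """Replace a leading '&' in table headers with the last absolute table name."""
    parent = None  # most recent header that did not start with '&'
    out = []
    for line in text.splitlines():
        m = HEADER.match(line.strip())
        if not m:
            out.append(line)  # key/value pairs, comments, blank lines
            continue
        open_br, name, close_br = m.groups()
        if len(open_br) != len(close_br):
            raise ValueError(f"mismatched brackets: {line!r}")
        if name == "&" or name.startswith("&."):
            if parent is None:
                raise ValueError("'&' used before any absolute table header")
            name = parent + name[1:]  # "&.sub" -> "parent.sub"
        else:
            parent = name  # an absolute header resets what '&' means
        out.append(f"{open_br}{name}{close_br}")
    return "\n".join(out)


doc = """[long-table-name]
[&.subtable1]
[&.subtable2]
[another-table]
[[&.aot-inside-the-other-table]]"""
print(expand_parent_refs(doc))
```

Because only absolute headers update `parent`, the result depends entirely on the order of the headers, which is the trade-off raised later in this thread.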

The elegance is beautiful, and I'm amazed and ashamed that it took me so long to realize it.

There could be a variant that makes aliasing an explicit practice. My @ruff example from before would look like this:

[@tool.ruff]  # Note the additional `@`-sign.
extend-include = ["*.ipynb"]
line-length = 100
python-version = "3.8"

[@.lint]  # And now we can use the `@` on its own, with a period and subtable names.
extend-select = ["B", "F", "I"]

These approaches would be extremely reliant on the order of tables. But syntax highlighters could slightly fade table names that use alias symbols. (Though the symbols themselves would appear as prominent as the absolute names.) And for the use case described here, it would allow for a separation of concerns. Imagine IDEs permitting folding to conceal entire sets of tables that use the same alias. Pretty cool, hypothetically!

And it would boil away another need for bloated inline tables. That would make several people happy. That would make me happy!

@mcrumiller How would this compare with your suggestion? I know you decided to simply use shorter names, but users won't always have the option to change configuration keys quickly. What do you think about all this?

Yes this is absolutely fine and I love the suggestion.

I like @arp242's suggestion to use & (or maybe another symbol) as placeholder for "the last table name element used in this position". @eksortso's "variant" with explicit @ before the aliased name seems a bit confusing to me, though I'm sure one could get used to it as well.

Previous discussion on that: #593, #744.

Although I'm not strictly against it, personally I wouldn't be in favour of adding that. "I would prefer that" meant "I would prefer having & over aliases", rather than "I would prefer if this gets added".

Like I said, I suspect multi-line inline tables should solve things fairly well for most use cases.

Over on #1019, I make a case that the use of ellipses "..." would make better sense than a newly imbued character like & or @.
Thanks to @levicki for this suggestion.

Like I said, I suspect multi-line inline tables should solve things fairly well for most use cases.

Inline literally means "arranged in a single line".

Therefore, there's no such thing as a "multi-line inline table" — tables can be multi-line or inline, not both.

Inline literally means "arranged in a single line".

And "inline" has a second meaning in English. Quoting from one of the Oxford dictionaries, it also means "constituting an integral part of a continuous sequence of operations or machines". In TOML, it's clear that this is what is meant, because inline tables constitute a continuous block of text, between curly braces, as a table value.

In fact, within an array, you could define two inline tables on a single line if you wanted. Nobody does that, but sensibility is hard to codify, and our minimal aesthetic compels us to avoid arbitrary restrictions. Our users can be more rigid than we allow them to be, and typically for their own sakes, they are.

Anyway, table name aliasing and/or abbreviation play no part in how inline tables work, either ideally or practically. Wouldn't you prefer to expound upon the use of ellipses, or comment on any of the other proposals here? Or whether or not this exercise is worthy of our time, which I have assumed it is?

And "inline" has a second meaning in English. Quoting from one of the Oxford dictionaries, it also means "constituting an integral part of a continuous sequence of operations or machines". In TOML, it's clear that this is what is meant, because inline tables constitute a continuous block of text, between curly braces, as a table value.

There, you said it yourself — continuous as in unbroken using say... line breaks?

Or are you now going to look for second meaning of continuous as well?

The intention when TOML was designed was clear:

  1. There are two flavors of tables — normal and inline
  2. Inline is meant to be used within arrays and as such it wasn't meant to be broken into multiple lines but to represent one entry per line in an array.

It's really that simple.

Anyway, table name aliasing and/or abbreviation play no part in how inline tables work, either ideally or practically. Wouldn't you prefer to expound upon the use of ellipses, or comment on any of the other proposals here? Or whether or not this exercise is worthy of our time, which I have assumed it is?

I proposed the ellipsis syntax as a compromise so that inline tables remain as they are, without allowing line breaks. Shortening the path using that syntax should make it easy enough for people to work with deep nesting without needing any change to inline tables.

I already made my case for why I think inline tables shouldn't be touched, and I also expounded on the ellipsis idea and why allowing arbitrary path depth instead of just the previous array wouldn't work.

I don't think there's much more I can say without risk of repeating myself. I do hope that sanity will prevail when it comes to inline tables, though.

If you are still unclear on the real definition of inline when it comes to layouts (because this is defining table layout), please look at how it is used in the CSS display property.

In short, when an HTML element is set to display inline, any height and width properties on it will have no effect.

Inserting line breaks into an inline table in TOML is therefore akin to trying to change the height of an inline HTML element, and to me it's perfectly clear why it wasn't allowed.

unbroken using say

@levicki. You know damned well what I'm talking about. I'm getting sick of this abuse. I should have spoken up more loudly when you blindsided me in #1019. And I regret having engaged with you as much as I have.

So write a PR to revert what you don't like about changes to inline tables that were merged years ago after many more years of discussion (#516, #781, #904, and elsewhere). You will have free rein to make your case in your own thread. You just may come out of this a hero in the end. If you do it the right way.

You know damned well what I'm talking about. I'm getting sick of this abuse. I should have spoken up more loudly when you blindsided me in #1019.

You have already complained loudly about that comment, can't you find something new to complain about?

And I regret having engaged with you as much as I have.

If you hadn't, then you wouldn't have gotten the ellipsis proposal from me.

So write a PR to revert what you don't like about changes to inline tables that were merged years ago after many more years of discussion (#516, #781, #904, and elsewhere).

I am not going to do that.

People who agreed to change it in the first place should realize their mistake and should be the ones reverting it.

You will have free rein to make your case in your own thread.

The only reason I mentioned it here is that most of those cases you are linking to were built around long table paths being inconvenient, less readable, or harder to write. Having the ellipsis syntax (or whatever other alias gets adopted) would eliminate the need for messing with inline tables to begin with. Aside from that complaint, which the ellipsis would alleviate, all other arguments for allowing line breaks ring hollow.

You just may come out of this a hero in the end. If you do it the right way.

I am not looking to be a hero, just to protect TOML from a nonsensical change which clashes with the meaning of "inline" in the dictionary and in computing, and which encourages hideous nesting where the format would otherwise force you to flatten the data by using longer table paths.

I could support either ellipses ([...subtable]) or ampersand ([&.subtable]) for table aliases, but only one of these. My preference is now with &. You can ignore my @ proposal.

But unless we hear an outpouring of support for table aliases now, let's save the prospect of table aliasing until after the release of TOML v1.1.0rc1. We can see what users prefer and act accordingly.

Also, @pradyunsg, my apologies for blowing up the way I did in this issue. I will try to handle heated exchanges with a bit more grace in the future.