Duplicate Map Entries get "squashed" to the last entry in Yaml src
rbuckland opened this issue · comments
There is unexpected behaviour when dealing with duplicate maps inside a yaml file.
foo:
value: 1
foo:
value: 2
will be valid, and will only have a {foo:{value:2}}
map (no error).
The expected behaviour would be to have an exception // parsing error of some form.
The underlying behaviour is in snakeyaml, and it tracked and documented well here https://bitbucket.org/asomov/snakeyaml/issues/337/option-to-disallow-duplicate-keys.
- I have already patched
snakeyaml
https://bitbucket.org/asomov/snakeyaml/pull-requests/10 - I have also forked and patched
moultingyaml
at this commit rbuckland@08b41af
This issue is needed to track the outcome of the PR to snakeyaml
.
I think the parseYaml
and parseYamls
methods of MoultingYAML should use the default BaseConstructor
of SnakeYAML if called without arguments. I'd more inclined to change the signature of parseYaml
and parseYamls
to something that allows customizing the SnakeYAML BaseConstructor
, i.e.:
def parseYaml(baseConstructor: BaseConstructor = new Constructor()) = ???
def parseYamls(baseConstructor: BaseConstructor = new Constructor()) = ???
👍 - makes sense to change the signature
regarding which constructor (or better put - the default behaviour)
I maybe slightly green on understanding the YAML specifications but it occurs to me that "accepting" duplicate keys (in 1.1) is not good.. and in 1.2 is a MUST NOT allow duplicates.
I could be wrong, but it leads to the "default" behaviour to be
NOT_ALLOW_DUPLICATES
and error on load if they are found.
In code that I am writing for snakeyaml
now ..
* YAML 1.1 is slightly vague around duplicate entries in the YAML file.
* The best reference is <a href="http://www.yaml.org/spec/1.1/#id862121">
* 3.2.1.3. Nodes Comparison</a> where it hints that a duplicate map key is
* an error.
*
* For future reference, YAML spec 1.2 is clear. The keys MUST be unique.
* <a href="http://www.yaml.org/spec/1.2/spec.html#id2759572">1.3. Relation
* to JSON</a>
I am about to push a PR to snakeyaml that introduces a config object to the constructor, instead of this overloaded method.
I have reworked the change in snakeyaml and then reworked the impact to my fork.
https://github.com/rbuckland/moultingyaml
-
snakeyaml now takes an optional LoadingConfig (similar to DumperOptions)
- it added
setAllowDuplicateKeys(boolean allowDuplicateKeys)
- it added
-
this loadingConfig is supplied via an implicit
implicit class PimpedString(val string: String) extends AnyVal { def parseYaml()(implicit loadingConfig: LoadingConfig = new LoadingConfig()): YamlValue = convertToYamlValue(new Yaml(loadingConfig).load(string)) def parseYamls()(implicit loadingConfig: LoadingConfig = new LoadingConfig()): Seq[YamlValue] = new Yaml(loadingConfig).loadAll(string).asScala.map(convertToYamlValue).toSeq }
implementation is simply
implicit val loadingConfig = new LoadingConfig() loadingConfig.setAllowDuplicateKeys(false) ... ... parseYamls...
-
once snakeyaml PR is reviewed / approved and released, I will create the PR from my fork
If SnakeYAML is to provide a way to configure the way YAMLs are parsed, I'd rather have the configuration be done explicitly in the parseYaml(s)
methods. Also, I wouldn't expose a SnakeYAML object (i.e. LoadingConfig
), but rather have a MoultingYAML interface to avoid exposing the underlying implementation (the configurable print method is implemented that way).