fabiocaccamo / python-benedict

:blue_book: dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

Optionally allow `keypath_separator` in input dict keys (`unflatten`).

brynpickering opened this issue · comments

We're loading in YAML files where we allow the dot notation in place of nested keys. Therefore, the keypath_separator is . and I'd like that to remain the separator in the resulting benedict.

Therefore: foo.bar: 1 should be loaded in as {'foo': {'bar': 1}} and then be accessible as my_dict["foo.bar"] and my_dict["foo"]["bar"].

Interestingly, in the current benedict implementation, I can make this happen for top-level keys containing the keypath separator, as follows:

[In] 
from benedict import benedict

my_dict = {"foo.bar": 1}
my_benedict = benedict()
my_benedict.merge(my_dict)
my_benedict

[Out] 
{'foo': {'bar': 1}}

But I get the (expected) error in these two cases:

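# Case 1: instantiating directly with a key that contains the separator
benedict({"foo.bar": 1})

# Case 2: merging a dict whose nested key contains the separator
my_dict = {"baz": {"foo.bar": 1}}
my_benedict = benedict()
my_benedict.merge(my_dict)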

python-benedict version: 0.33.1

@brynpickering thanks for reporting this issue.

Correct me if I'm wrong, but this is a problem with the merge method, right?

Also instantiation. I want benedict({"baz": {"foo.bar": 1}}, keypath_separator=".") to work and to be represented internally as {"baz": {"foo": {"bar": 1}}}.

The second part of my issue is more of a bug based on the existing expected functionality, but is in the correct direction for the feature I would like 😅

@brynpickering with d79336d the behaviour of the merge method has been fixed: now, if the input dict has keys containing the keypath_separator, a ValueError is raised (intended behaviour).

So, with this fix, this issue becomes a feature request instead of a bug report (correct me if I'm wrong).

Yes, that's correct.

@brynpickering I thought about this as a feature. It can't be the default behaviour, because it would be "invisibly backward incompatible" when using this library with existing dicts that contain the keypath_separator in some keys.

To achieve what you need, you can use the flatten / unflatten methods:

Write

d = benedict()
# ...
f = d.flatten(separator="__")
f.to_yaml(filepath="my-dict.yml")

Read

f = benedict.from_yaml("my-dict.yml")
d = f.unflatten(separator="__")

I had a look at unflatten. I think the approach would need to be:

f = benedict.from_yaml("my-dict.yml", keypath_separator=None)
d = benedict(f.unflatten(separator="."))  # must wrap to be able to set and get keys with the "." separator 

It's not particularly pretty and potentially involves creating the same representation of data in memory three times (I haven't looked into exactly what goes on under the hood with the data when calling unflatten and benedict).

You're right that it would be silently backward incompatible, but it seems more intuitive to me that defining the keypath separator on loading data is equivalent to telling benedict that any keys using that separator are meant to be separated.

Perhaps a flag, named something like separate_on_separator, separate, or unflatten, could be added wherever keypath_separator is available as an argument. If True, it would split any keys containing the separator into nested dicts. It could default to False to preserve the current behaviour.
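
To make the proposal concrete, here is a minimal sketch of what such a flag might do, written against plain dicts only. The helper names unflatten_nested and _deep_merge are hypothetical and are not part of python-benedict. The sketch assumes that keys should be split at every nesting level, and that a key colliding with an existing non-dict value should raise a ValueError.

```python
from typing import Any, Mapping


def _deep_merge(dst: dict, src: dict) -> None:
    """Merge src into dst in place, recursing into nested dicts."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            _deep_merge(dst[key], value)
        else:
            dst[key] = value


def unflatten_nested(data: Mapping[Any, Any], separator: str = ".") -> dict:
    """Hypothetical helper: expand keys containing `separator` into nested
    dicts, at every nesting level of `data`. The input is not modified."""
    result: dict = {}
    for key, value in data.items():
        # Expand nested mappings first so that inner keys are split as well.
        if isinstance(value, Mapping):
            value = unflatten_nested(value, separator)
        # Only string keys can contain the separator.
        parts = key.split(separator) if isinstance(key, str) else [key]
        target = result
        for part in parts[:-1]:
            child = target.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(
                    f"key {key!r} conflicts with an existing non-dict value"
                )
            target = child
        last = parts[-1]
        if isinstance(value, dict) and isinstance(target.get(last), dict):
            # Both "foo.bar" and "foo" were given: combine them.
            _deep_merge(target[last], value)
        else:
            target[last] = value
    return result


print(unflatten_nested({"baz": {"foo.bar": 1}}))
# → {'baz': {'foo': {'bar': 1}}}
```

The resulting plain dict contains no separators in its keys, so it could then be wrapped in benedict(...) with keypath_separator="." to allow both my_dict["baz.foo.bar"] and my_dict["baz"]["foo"]["bar"]. Whether the library would implement the flag this way is an open question.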