fabiocaccamo / python-benedict

dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

s3 working as intended? Catch-22 for S3 read

eric-tramel opened this issue

Python version
3.9.15

Package version
0.27.1

Current behavior (bug description)

I'd like to use the S3 functionality to load a JSON file from S3, but I'm running into a catch-22 with the s3_options kwarg. According to the README.md, the following pattern should work:

import os
from benedict import benedict

creds = benedict(
    os.path.expanduser("~/.aws/credentials"), 
    format="ini"
)

d = benedict(
    "s3://bucket/path/to/some.json", 
    s3_options=creds["default"]
)

However, running the above raises the following error:

  File "/Users/eritrame/src/facteur2/test.py", line 6, in <module>
    d = benedict("s3://unlearnai/eric-scratch/ad.json", s3_options=creds["default"])
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/__init__.py", line 52, in __init__
    super(benedict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/keypath/keypath_dict.py", line 14, in __init__
    super(KeypathDict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/keylist/keylist_dict.py", line 10, in __init__
    super(KeylistDict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 16, in __init__
    d = IODict._decode_init(args[0], **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 27, in _decode_init
    return IODict._decode(s, format, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 35, in _decode
    raise ValueError(f"Invalid data or url or filepath argument: {s}\n{e}")
ValueError: Invalid data or url or filepath argument: s3://unlearnai/eric-scratch/ad.json
__init__() got an unexpected keyword argument 's3_options'

Okay, strange. But what happens if we don't supply AWS credentials for S3 access?

Traceback (most recent call last):
  File "/Users/eritrame/src/facteur2/test.py", line 9, in <module>
    d = benedict(
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/__init__.py", line 52, in __init__
    super(benedict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/keypath/keypath_dict.py", line 14, in __init__
    super(KeypathDict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/keylist/keylist_dict.py", line 10, in __init__
    super(KeylistDict, self).__init__(*args, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 16, in __init__
    d = IODict._decode_init(args[0], **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 27, in _decode_init
    return IODict._decode(s, format, **kwargs)
  File "/Users/eritrame/.pyenv/versions/miniconda3-latest/envs/facteur2-dev/lib/python3.9/site-packages/benedict/dicts/io/io_dict.py", line 35, in _decode
    raise ValueError(f"Invalid data or url or filepath argument: {s}\n{e}")
ValueError: Invalid data or url or filepath argument: s3://unlearnai/eric-scratch/ad.json
read_content_from_s3() missing 1 required positional argument: 's3_options'

So it does expect s3_options to be present, but if it's present, it doesn't work.

Going deeper, it seems like the call below (in IODict)

    @staticmethod
    def _decode(s, format, **kwargs):
        data = None
        try:
            data = io_util.decode(s, format, **kwargs)
        except Exception as e:
            raise ValueError(f"Invalid data or url or filepath argument: {s}\n{e}")

is passing the full set of **kwargs forward to the deserializer. Shouldn't it first remove the benedict-specific s3_options? As it stands, the JSON deserialization appears to fail inside io_util.decode on the unexpected keyword argument, and the generic Exception catch swallows that failure and reports it as an invalid data path.
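
To make the suspected failure mode concrete, here is a minimal sketch of the idea, not the library's actual code or the fix that was eventually shipped. The functions `decode` and `load` are hypothetical, and `read_content_from_s3` is a stand-in that returns canned JSON rather than calling AWS. It shows two things: popping the reader-only `s3_options` before the remaining kwargs reach `json.loads`, and re-raising with `from e` so the underlying error stays attached to the generic one.

```python
import json


def read_content_from_s3(url, s3_options):
    # Hypothetical stand-in for an S3 reader: returns canned JSON, no network.
    if "aws_access_key_id" not in s3_options:
        raise ValueError("missing credentials")
    return '{"answer": 42}'


def decode(s, **kwargs):
    # Pop the reader-only option so it never reaches the deserializer.
    s3_options = kwargs.pop("s3_options", None)
    if s.startswith("s3://"):
        if s3_options is None:
            raise ValueError("s3_options is required for s3:// URLs")
        s = read_content_from_s3(s, s3_options)
    # Only deserializer options remain in kwargs at this point.
    return json.loads(s, **kwargs)


def load(s, **kwargs):
    try:
        return decode(s, **kwargs)
    except Exception as e:
        # "from e" chains the original exception instead of hiding it.
        raise ValueError(f"Invalid data or url or filepath argument: {s}") from e
```

If `s3_options` were left in `kwargs`, `json.loads` would forward it to the decoder class and raise a `TypeError` about an unexpected keyword argument, which looks consistent with the first traceback above.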

Expected behavior
It should be able to load the JSON as a dict.

@eric-tramel fixed in version 0.28.0.