Path.extract can't handle double backslashes
HiddeLekanne opened this issue
Describe the bug
deepdiff.path.extract can't handle double backslashes "\\". The second backslash is still treated as the start of an escape sequence together with the character after it, so in the example below "\b" becomes the backspace character "\x08".
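To make the symptom concrete, here is a small stand-alone Python check. It does not call DeepDiff; it only compares the key as written in the report with the key shown in the KeyError. The variable names are illustrative.

```python
# In Python source, "\\" is ONE literal backslash character.
key = "abc\\bTHIS_b_CANT_BE_HERE"

# The KeyError shows this string instead: the backslash and the "b"
# were collapsed into the single escape "\b" (backspace, "\x08").
mangled = "abc\bTHIS_b_CANT_BE_HERE"

print(repr(key))       # repr shows the backslash doubled
print(repr(mangled))   # repr shows \x08 where the backslash and "b" were
print(key == mangled)  # the two strings are different dictionary keys
```

Because the two strings differ, looking up the mangled one in the dictionary fails even though the original key is present.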
To Reproduce
from deepdiff import grep, extract
obj = ["something somewhere", {"abc\\bTHIS_b_CANT_BE_HERE": "somewhere", "string": 2, 0: 0, "somewhere": "around"}]
item = "somewhere"
ds = obj | grep(item)
for path in ds["matched_values"]:
    print(extract(obj, path))
This will result in an error:
Traceback (most recent call last):
  File "...\test.py", line 11, in <module>
    print(extract(obj, path))
  File "...\venv\lib\site-packages\deepdiff\path.py", line 169, in extract
    return _get_nested_obj(obj, elements)
  File "...\venv\lib\site-packages\deepdiff\path.py", line 108, in _get_nested_obj
    obj = obj[elem]
KeyError: 'abc\x08THIS_b_CANT_BE_HERE'
something somewhere
Expected behavior
I expect extract to handle a "\\" in dictionary keys.
OS, DeepDiff version and Python version (please complete the following information):
- OS: [Windows]
- Version [10]
- Python Version [3.7]
- DeepDiff Version [6.3]
Additional context
Fix could be:
for char in path:
    if prev_char == '\\':
        if char != '\\':  # Treat "\\" as a single escape character
            elem += '\\'
        elem += char
Instead of the current:
for char in path:
    if prev_char == '\\':
        elem += char
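For readers who want to try the idea in isolation, here is a minimal stand-alone sketch of the suggested rule: a doubled backslash collapses to one literal backslash, and any other backslash is kept as-is. The helper name unescape and the pending flag are hypothetical; this is not DeepDiff's parser, and it assumes the path text carries backslashes doubled.

```python
def unescape(text):
    # Hypothetical helper, not part of DeepDiff.
    out = []
    pending = False  # True when the previous char was a not-yet-emitted backslash
    for char in text:
        if pending:
            if char != "\\":
                out.append("\\")  # lone backslash: keep it literally
            out.append(char)      # for a doubled backslash this emits just one
            pending = False
        elif char == "\\":
            pending = True        # wait to see whether it is doubled
        else:
            out.append(char)
    if pending:
        out.append("\\")          # trailing lone backslash
    return "".join(out)


# Raw string: the text really contains two backslash characters.
escaped = r"abc\\bTHIS_b_CANT_BE_HERE"
print(unescape(escaped) == "abc\\bTHIS_b_CANT_BE_HERE")
```

An explicit flag is used instead of comparing against the previous character so that a run of four backslashes collapses to two rather than being miscounted.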
Hi @HiddeLekanne
Thanks for reporting the issue. I will keep this in mind for the next release. PRs are very welcome too!
I would love to do that sometime, but I am unsure about the design requirements. For the purposes of this bug report I assumed that there is an encode/decode relationship between the generated path and extract. To that extent, I don't understand why the encoding side would even try to interpret backslashes (and other Python string features) in the first place.
In short: why are we not working with raw Python strings (r'string')?
So a bugfix would either add a bunch of if statements that account for the Python string features in order to reverse them, or switch to working in raw Python strings. I don't know which kind of solution you would prefer.
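As an illustration of the encode/decode relationship described above, here is a minimal sketch using only the standard library. The names encode_key and decode_key are hypothetical, and this is an assumption about what a round-trip could look like, not how DeepDiff builds or parses paths.

```python
import ast


def encode_key(key):
    # Hypothetical encoder: render the key as a quoted Python string literal.
    return repr(key)


def decode_key(text):
    # Hypothetical decoder: parse that literal back into the original string.
    return ast.literal_eval(text)


key = "abc\\bTHIS_b_CANT_BE_HERE"  # one literal backslash in the key
encoded = encode_key(key)          # the literal text doubles the backslash
print(encoded)
print(decode_key(encoded) == key)  # the round trip restores the key
```

The point of the sketch is only that whichever side escapes the backslash, the other side has to undo exactly that escaping, no more and no less.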
Hi @HiddeLekanne
This is fixed in the recent DeepDiff releases.