Image reading function does not support non-Latin paths
fireattack opened this issue
Most, if not all, of the functions fail on non-Latin (Unicode) paths/filenames, at least on Windows.
The reason is that cv2.imread has very poor Unicode path support (opencv/opencv#4292), and the OpenCV maintainers don't have any plan to fix it.
For now I have to patch it at
perception/perception/hashers/tools.py
Lines 365 to 371 in e787909
with something ugly like this:
with PIL.Image.open(filepath_or_buffer) as im:
    rgb = im.convert("RGB")
    return np.array(rgb)
# image = cv2.imread(filepath_or_buffer)
# Adapted from the `if PIL is not None and isinstance(filepath_or_buffer, PIL.Image.Image):` case above
This works because Pillow handles non-Latin paths much better.
Alternatives like
# Read the bytes in Python (which handles Unicode paths) and close the file afterwards
with open(filepath_or_buffer, "rb") as f:
    image = np.asarray(bytearray(f.read()), dtype=np.uint8)
image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
# Again, adapted from the `if isinstance(filepath_or_buffer, (io.BytesIO, client.HTTPResponse)):` case above
or simply
image = cv2.imdecode(np.fromfile(filepath_or_buffer, dtype=np.uint8), cv2.IMREAD_UNCHANGED)
could also work.
I hope we can get a proper fix for this.