thorn-oss / perception

Most, if not all, of the functions fail on non-Latin (Unicode) paths and filenames, at least on Windows.

The reason is that cv2.imread has very poor support for Unicode paths (opencv/opencv#4292), and the OpenCV maintainers don't have any plan to fix it.

As a temporary workaround, I replaced the code at

perception/perception/hashers/tools.py

Lines 365 to 371 in e787909

    
    elif isinstance(filepath_or_buffer, str):
        if validators.url(filepath_or_buffer):
            return read(request.urlopen(filepath_or_buffer, timeout=timeout))
        if not os.path.isfile(filepath_or_buffer):
            raise FileNotFoundError('Could not find image at path: ' +
                                    filepath_or_buffer)
        image = cv2.imread(filepath_or_buffer)

with something ugly like this

        # Adapted from the `if PIL is not None and isinstance(filepath_or_buffer, PIL.Image.Image):` case above
        with PIL.Image.open(filepath_or_buffer) as im:
            rgb = im.convert("RGB")
        return np.array(rgb)
        # image = cv2.imread(filepath_or_buffer)

This works because Pillow handles non-Latin paths much better.

Things like

        # Again, adapted from the `if isinstance(filepath_or_buffer, (io.BytesIO, client.HTTPResponse)):` case above
        with open(filepath_or_buffer, "rb") as f:  # close the file handle once the bytes are read
            image = np.asarray(bytearray(f.read()), dtype=np.uint8)
        image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)

or simply

        image = cv2.imdecode(np.fromfile(filepath_or_buffer, dtype=np.uint8), cv2.IMREAD_UNCHANGED)

could also work.

It would be great to have a proper fix for this.

Image reading function does not support non-latin paths