Injection error
shc261392 opened this issue
Image https://drive.google.com/file/d/17DgQCF-TOUEk9LaroHYa9OPaGhXw6soQ/view?usp=sharing
Image sha256sum: 55146f1665fa84fe2a76d13772f7f83ea02a188cde68a047cb9acd2e28005d90
$ git checkout 72735b
$ mv <downloaded-image> dog.jpeg
$ cp dog.jpeg dog-thumbnail.jpeg
$ python3 utils/starling_multiple_injection.py dog.jpeg
Traceback (most recent call last):
File "/Users/shc/numbers/github/starling-cai/utils/starling_multiple_injection.py", line 166, in <module>
starling = Starling(photo_bytes,
File "/Users/shc/numbers/github/starling-cai/cai/starling.py", line 74, in __init__
self.app11_headers = get_app11_marker_segment_headers(self.raw_bytes)
File "/Users/shc/numbers/github/starling-cai/cai/jumbf.py", line 219, in get_app11_marker_segment_headers
header['tbox'] = data_bytes[offset + 16 : offset + 20].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 3: unexpected end of data
Root Cause
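The CAI module treats non-CAI data as CAI metadata and tries to parse it. In the traceback above, the four bytes read as the box type (tbox) are not valid UTF-8, so decoding fails.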
Analysis
Currently, the CAI module locates the CAI metadata (more precisely, the APP11 Marker Segments) only by searching for the byte sequence 0xFFEB, which represents the APP11 Marker.
Under this condition, any data that happens to contain 0xFFEB will be treated as the beginning of CAI metadata, even when it sits inside another segment such as Exif or thumbnail data.
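The failure mode can be reproduced without the repository code. The sketch below is not the project's implementation: naive_app11_offsets and the fake byte string are hypothetical, and the only detail borrowed from the traceback is that tbox is read from offset + 16 to offset + 20.

```python
APP11_MARKER = b"\xff\xeb"


def naive_app11_offsets(data: bytes) -> list[int]:
    """Return every offset where FF EB occurs, with no structural checks."""
    offsets = []
    pos = data.find(APP11_MARKER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(APP11_MARKER, pos + 1)
    return offsets


# Hypothetical non-CAI payload that merely happens to contain FF EB.
# The 4 bytes at offset + 16 end with 0xED, an incomplete UTF-8 sequence.
fake = b"\x00" * 4 + APP11_MARKER + b"\x00" * 14 + b"abc\xed"

for offset in naive_app11_offsets(fake):
    tbox_bytes = fake[offset + 16 : offset + 20]
    try:
        print(offset, tbox_bytes.decode("utf-8"))
    except UnicodeDecodeError as exc:
        # Same class of error as in the traceback above.
        print(offset, "not a box type:", exc)
```

Because nothing ties the match to the JPEG segment structure, the scan cannot tell a real APP11 Marker Segment from two stray bytes.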
Solution
Method 1: Workaround (I will take this approach because of resource constraints)
Checking both the APP11 Marker and the CI parameter can be a quick workaround. It is only a workaround because it reduces, but does not eliminate, the probability of treating non-CAI data as CAI metadata.
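A minimal sketch of such a check follows. It assumes the JPEG XT box layout, in which the marker (2 bytes) is followed by the length Le (2 bytes) and then the CI field (2 bytes) holding 0x4A50, ASCII "JP". That placement is consistent with tbox being read at offset + 16 in the traceback, but it is an assumption here, not something taken from the repository. The function names are hypothetical.

```python
APP11_MARKER = b"\xff\xeb"
JPEG_XT_CI = b"JP"  # assumed Common Identifier value, 0x4A50


def looks_like_cai_segment(data: bytes, offset: int) -> bool:
    """Check the APP11 marker at `offset` and the assumed CI field after Le."""
    if data[offset : offset + 2] != APP11_MARKER:
        return False
    # Assumed layout: marker (2) + Le (2) puts CI at offset + 4.
    return data[offset + 4 : offset + 6] == JPEG_XT_CI


def candidate_cai_offsets(data: bytes) -> list[int]:
    """Scan for FF EB and keep only hits whose CI field matches."""
    offsets = []
    pos = data.find(APP11_MARKER)
    while pos != -1:
        if looks_like_cai_segment(data, pos):
            offsets.append(pos)
        pos = data.find(APP11_MARKER, pos + 1)
    return offsets
```

A stray FF EB followed two bytes later by "JP" would still pass, which is why this narrows the problem rather than solving it.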
Method 2: Root cause solution
To the best of my knowledge, fixing this issue completely requires finding the starting point of every Marker Segment between SOI and DQT, and then parsing only the APP11 Marker Segments among them.
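The segment walk described in Method 2 could look roughly like the sketch below. It relies on the standard JPEG rule that each marker segment after SOI is a 2-byte marker followed by a 2-byte big-endian length that counts itself. The function name and the sample bytes are hypothetical, and the sketch ignores optional 0xFF fill bytes between segments. It stops at DQT as proposed above, and also at SOS as a safety net.

```python
import struct

APP11 = 0xFFEB
DQT = 0xFFDB
SOS = 0xFFDA


def app11_segment_offsets(data: bytes) -> list[int]:
    """Walk marker segments from SOI and return offsets of APP11 segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    offsets = []
    pos = 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack_from(">HH", data, pos)
        if marker >> 8 != 0xFF or length < 2:
            raise ValueError(f"malformed segment at offset {pos}")
        if marker in (DQT, SOS):
            break  # stop before the tables and entropy-coded data
        if marker == APP11:
            offsets.append(pos)
        pos += 2 + length  # jump over the payload instead of scanning it
    return offsets


# Hypothetical sample: an APP1 payload containing a stray FF EB,
# followed by one real APP11 segment and an empty DQT.
exif_payload = b"Exif\x00\x00" + b"\xff\xeb" + b"junk"
app11_payload = b"JP" + b"\x00\x01" + b"\x00\x00\x00\x01"
sample = (
    b"\xff\xd8"
    + b"\xff\xe1" + struct.pack(">H", 2 + len(exif_payload)) + exif_payload
    + b"\xff\xeb" + struct.pack(">H", 2 + len(app11_payload)) + app11_payload
    + b"\xff\xdb\x00\x02"
)
print(app11_segment_offsets(sample))
```

Because the walk jumps from one segment header to the next, the FF EB bytes inside the APP1 payload are never inspected.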
Testing image information
$ file scott.jpg
scott.jpg: JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=12, height=3024, manufacturer=samsung, model=SM-N9810, orientation=upper-right, xresolution=210, yresolution=218, resolutionunit=2, software=N9810ZSU1ATI4, datetime=2020:10:24 15:03:33, width=4032], baseline, precision 8, 4032x3024, components 3
The workaround seems to work (although with known issue #15)
$ sha256sum scott*
47e148074e9a3f658119c82e1a2e5aebb148a2a3864f6a1e4d1f58a4bd31a0ee scott-cai-cai-cai.jpg
55146f1665fa84fe2a76d13772f7f83ea02a188cde68a047cb9acd2e28005d90 scott.jpg
bfd0c280dfa195a0e8468a0f0d1d6beecb652a70cf591c19187d0c5166cef6a8 scott-thumbnail.jpg