jo1gi / audiobook-dl

Audiobook CLI downloader

ID3 metadata updates error with characters in the unicode range

Xetera opened this issue

Installation method:
pip install "git+https://github.com/jo1gi/audiobook-dl.git"

Version:
audiobook-dl 0.7.3

Describe the bug
From a little research, it sounds like the string "Şeker Portakalı" should be writable as ID3v2.4, since the v2 spec supports Unicode rather than only Latin-1. However, the call to audio.save leaves the v1 option at its default (1), which appears to update any existing ID3v1 tags. I think that might be causing the write failure in this edge case, but I haven't really tested it.

audio.save(v2_version=4)

I'm not really sure what the right solution is. It would be safe to set v1=0, which removes any existing v1 tags, but there might be extra metadata worth keeping around. Still, I'd prefer working metadata with missing custom fields over a broken package.
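To illustrate the point about text encodings, here is a minimal sketch using only the standard library. The encoding bytes (0 to 3) are the ones defined for ID3v2.4 text frames; `pick_encoding` is a hypothetical helper written for this example and is not part of mutagen or audiobook-dl.

```python
# ID3v2.4 text-frame encoding bytes mapped to the matching Python codecs.
ID3V24_ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def pick_encoding(text: str) -> int:
    """Hypothetical helper: first ID3v2.4 encoding byte that can hold `text`."""
    for byte, codec in ID3V24_ENCODINGS.items():
        try:
            text.encode(codec)
            return byte
        except UnicodeEncodeError:
            continue
    raise ValueError("text cannot be encoded")


title = "Şeker Portakalı"
print(pick_encoding(title))        # Latin-1 is rejected, a Unicode encoding is used
print(pick_encoding("Moby Dick"))  # plain ASCII fits in Latin-1
```

So the title itself should be storable in a v2.4 text frame; a field that only accepts Latin-1 would not be.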

Command output

╰$ audiobook-dl https://www.storytel.com/tr/books/%C5%9Feker-portakal%C4%B1-1024401 --debug
DEBUG audiobook-dl 0.7.3
DEBUG python 3.11.7 (main, Dec  4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)]
 INFO Finding compatible source
 INFO Authenticating with storytel
DEBUG Logging in
DEBUG Downloading result of https://www.storytel.com/tr/books/%C5%9Feker-portakal%C4%B1-1024401
DEBUG download: url='https://www.storytel.com/tr/books/%C5%9Feker-portakal%C4%B1-1024401', list_type='books', language='tr', language2=None
DEBUG URL https://storytel.com/tr/tr/books/şeker-portakalı-1024401
 INFO
 INFO Downloading Şeker Portakalı from storytel
DEBUG Starting downloading file:
https://fastly-ng.storytel.net/mp3encoder-64/c8939764-d6db-479b-b373-05754f0d589f?consumableId=1024401&isbn=9789180127998&type=audio&token=1709497701_2d585a9e0832d6d24feeacecc52e916bace424e6691d5ef6bf7370eeeb7b975c
  Downloading Şeker Portakalı ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
 INFO Adding metadata
Traceback (most recent call last):
  File "/opt/homebrew/bin/audiobook-dl", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/__main__.py", line 33, in main
    process_url(url, options, config)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/__main__.py", line 61, in process_url
    process_audiobook(source, result, options)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/__main__.py", line 173, in process_audiobook
    download(audiobook, options)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/output/download.py", line 37, in download
    download_audiobook(audiobook, output_dir, options)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/output/download.py", line 65, in download_audiobook
    add_metadata_to_file(audiobook, filepaths[0], options)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/output/download.py", line 80, in add_metadata_to_file
    metadata.add_metadata(filepath, audiobook.metadata)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/output/metadata/__init__.py", line 11, in add_metadata
    id3.add_id3_metadata(filepath, metadata)
  File "/opt/homebrew/lib/python3.11/site-packages/audiobookdl/output/metadata/id3.py", line 80, in add_id3_metadata
    audio.save(v2_version=4)
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/_util.py", line 156, in wrapper
    return func(self, h, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/_file.py", line 132, in save
    return self.tags.save(filething, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/_util.py", line 156, in wrapper
    return func(self, h, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/easyid3.py", line 198, in save
    self.__id3.save(filething, v1=v1, v2_version=v2_version,
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/_util.py", line 185, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/_util.py", line 156, in wrapper
    return func(self, h, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_file.py", line 260, in save
    data = self._prepare_data(
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_file.py", line 194, in _prepare_data
    framedata = self._write(config)
                ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_tags.py", line 191, in _write
    framedata = [
                ^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_tags.py", line 192, in <listcomp>
    (f, save_frame(f, config=config)) for f in self.values()]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_tags.py", line 511, in save_frame
    framedata = frame._writeData(config)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_frames.py", line 211, in _writeData
    writer.write(config, frame, getattr(frame, writer.name)))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mutagen/id3/_specs.py", line 575, in write
    return value.encode('latin1') + b'\x00'
           ^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'latin-1' codec can't encode character '\u015f' in position 33: ordinal not in range(256)

After some investigation, it turns out the only problematic part was scrape_url. The PR above fixes my problem, though I wasn't able to work out why the URL is the only field that fails.
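A possible explanation, offered as a sketch rather than a confirmed diagnosis: the traceback ends in a Latin-1 encode that fails at position 33, and position 33 of the URL in the "DEBUG URL" line is exactly the "ş". ID3v2 URL fields are defined as ISO-8859-1 only, so unlike text frames they cannot switch to a Unicode encoding. One workaround is to percent-encode the URL before tagging. `to_latin1_safe_url` below is a hypothetical helper for illustration and not necessarily what the PR does.

```python
from urllib.parse import quote


def to_latin1_safe_url(url: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII characters in a URL."""
    # Reserved URL characters and existing %-escapes are left untouched.
    return quote(url, safe=":/?#[]@!$&'()*+,;=%~")


url = "https://storytel.com/tr/tr/books/şeker-portakalı-1024401"
print(url[33])  # the character the traceback complains about

try:
    url.encode("latin1")
except UnicodeEncodeError as err:
    print(f"raw URL cannot be stored: {err.reason}")

safe_url = to_latin1_safe_url(url)
safe_url.encode("latin1")  # no longer raises
print(safe_url)
```

The encoded path matches the percent-escaped form of the URL that was originally passed on the command line.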