tubearchivist / tubearchivist

Your self hosted YouTube media server

Home Page:https://www.tubearchivist.com

[Bug]: Auto Caption on YT, although status code 200, doesn't return anything

lukyjay opened this issue

I've read the documentation

Operating System

Linux (Docker on Debian LXC in Proxmox)

Your Bug Report

Describe the bug

I've had no issues using TA to download thousands of videos but all of a sudden I get an error: Expecting value: line 1 column 1 (char 0)

No matter which video in my queue I press to download, it seems to try downloading https://www.youtube.com/watch?v=W1Qyv6EI2hE and results in this error. (that video is at the top of my queue)

I have attempted restarting all services and updating them all. ES and Redis containers have no errors. My cookies are up to date.

I went into the settings and deleted the queue and it started working again, but once it reaches that video it suddenly presents the same error. Strange! Seems this one video causes some issues? W1Qyv6EI2hE

Steps To Reproduce

Go to Downloads screen. Either start downloads or press download now next to any video in queue.

I noticed mention of subtitle in the log. I do have subtitles set to "en, auto and true" in the settings.

Expected behavior

Videos download

Relevant log output

230 static files copied to '/app/staticfiles', 974 post-processed.


                         ....  .....
                  ...'',;:cc,. .;::;;,'...
               ..,;:cccllclc,  .:ccllllcc;,..
            ..,:cllcc:;,'.',.  ....'',;ccllc:,..
          ..;cllc:,'..                ...,:cccc:'.
         .;cccc;..                        ..,:ccc:'.
       .ckkkOkxollllllllllllc.      .,:::;.  .,cclc;
      .:0MMMMMMMMMMMMMMMMMMMX:     .cNMMMWx.   .;clc:
     .;lOXK0000KNMMMMX00000KO;     ;KMMMMMNl.   .;ccl:,.
     .;:c:'.....kMMMNo........    'OMMMWMMMK:    '::;;'.
   .......     .xMMMNl           .dWMMXdOMMMO'   ........
   .:cc:;.     .xMMMNc          .lNMMNo.:XMMWx.    .:cl:.
   .:llc,.     .:xxxd,          ;KMMMk. .oWMMNl.   .:llc'
   .cll:.     .;:;;:::,.       'OMMMK:';''kWMMK:   .;llc,
   .cll:.     .,;;;;;;,.     .,xWMMNl.:l:.;KMMMO'  .;llc'
   .:llc.      .cOOOk;      .lKNMMWx..:l:..lNMMWx. .:llc'
   .;lcc,.     .xMMMNc      :KMMMM0, .:lc. .xWMMNl.'ccl:.
    .cllc.     .xMMMNc     'OMMMMXc...:lc...,0MMMKl:lcc,.
    .,ccl:.    .xMMMNc    .xWMMMWo.,;;:lc;;;.cXMMMXdcc;.
     .,clc:.   .xMMMNc   .lNMMMWk. .':clc:,. .dWMMW0o;.
      .,clcc,. .ckkkx;   .okkkOx,    .';,.    'kKKK0l.
       .':lcc:'.....      .  ..            ..,;cllc,.
         .,cclc,....                     ....;clc;..
          ..,:,..,c:'..              ...';:,..,:,.
            ....:lcccc:;,'''.....'',;;:clllc,....
               .'',;:cllllllccccclllllcc:,'..
                   ...'',,;;;;;;;;;,''...
                            .....


#######################
#  Environment Setup  #
#######################

[1] checking expected env vars
    ✓ all expected env vars are set
[2] check ES user overwrite
    ✓ ES user is set to elastic
[3] check TA_PORT overwrite
    TA_PORT is not set
[4] check TA_UWSGI_PORT overwrite
    TA_UWSGI_PORT is not set
[5] check ENABLE_CAST overwrite
    ENABLE_CAST is not set
[6] create superuser
    superuser already created


#######################
#  Connection check   #
#######################

[1] connect to Redis
    ✓ Redis connection verified
[2] set Redis config
    ✓ Redis config set
[3] connect to Elastic Search
    ... waiting for ES [0/24]
    ✓ ES connection established
[4] Elastic Search version check
    ✓ ES version check passed
[5] check ES path.repo env var
    ✓ path.repo env var is set


#######################
#  Application Start  #
#######################

[1] set new config.json values
    ✓ new config values set
[2] create expected cache folders
    ✓ expected folders created
[3] clear leftover keys in redis
    no keys found
[4] clear task leftovers
[5] clear leftover files from dl cache
clear download cache
    ✓ cleared 1 files
[6] check for first run after update
    no new update found
[MIGRATION] validate index mappings
ta_config index is created and up to date...
ta_channel index is created and up to date...
ta_video index is created and up to date...
ta_download index is created and up to date...
ta_playlist index is created and up to date...
ta_subtitle index is created and up to date...
ta_comment index is created and up to date...
[MIGRATION] setup snapshots
snapshot: run setup
snapshot: repo ta_snapshot already created
snapshot: policy is set.
snapshot: last snapshot is up-to-date
[MIGRATION] move user configuration to ES
    ✓ Settings for user '1' migrated to ES
    ✓ Settings for all users migrated to ES


########################
# Filesystem Migration #
########################

    no channel migration needed
[uWSGI] getting INI configuration from uwsgi.ini
*** Starting uWSGI 2.0.23 (64bit) on [Sun Feb 11 12:17:03 2024] ***
compiled with version: 10.2.1 20210110 on 27 January 2024 04:04:41
os: Linux-6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z)
nodename: c92ab0fed943
machine: x86_64
clock source: unix
detected number of CPU cores: 16
current working directory: /app
writing pidfile to /tmp/project-master.pid
detected binary path: /root/.local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8080 fd 3
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
Python version: 3.11.3 (main, May 23 2023, 13:34:03) [GCC 10.2.1 20210110]
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x7f97f59cf558
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 154032 bytes (150 KB) for 1 cores
*** Operational MODE: single process ***
celery beat v5.3.6 (emerald-rush) is starting.
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7f97f59cf558 pid: 28 (default app)
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 28)
spawned uWSGI worker 1 (pid: 52, cores: 1)
/root/.local/lib/python3.11/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

  warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
__    -    ... __   -        _
LocalTime -> 2024-02-11 12:17:04
Configuration ->
    . broker -> redis://archivist-redis:6379//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> celery.beat.PersistentScheduler
    . db -> /celerybeat-schedule
    . logfile -> [stderr]@%INFO
    . maxinterval -> 5.00 minutes (300s)
[2024-02-11 12:17:04,067: INFO/MainProcess] beat: Starting...
[2024-02-11 12:17:04,087: INFO/MainProcess] Scheduler: Sending due task schedule_version_check (version_check)
 
 -------------- celery@c92ab0fed943 v5.3.6 (emerald-rush)
--- ***** ----- 
-- ******* ---- Linux-6.5.11-8-pve-x86_64-with-glibc2.31 2024-02-11 12:17:04
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         tasks:0x7fd92fc70c50
- ** ---------- .> transport:   redis://archivist-redis:6379//
- ** ---------- .> results:     redis://archivist-redis:6379/
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery
                

[tasks]
  . check_reindex
  . download_pending
  . extract_download
  . index_playlists
  . manual_import
  . rescan_filesystem
  . restore_backup
  . resync_thumbs
  . run_backup
  . subscribe_to
  . thumbnail_check
  . update_subscribed
  . version_check

[2024-02-11 12:17:05,538: WARNING/MainProcess] /root/.local/lib/python3.11/site-packages/celery/worker/consumer/consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
  warnings.warn(

[2024-02-11 12:17:05,544: INFO/MainProcess] Connected to redis://archivist-redis:6379//
[2024-02-11 12:17:05,546: WARNING/MainProcess] /root/.local/lib/python3.11/site-packages/celery/worker/consumer/consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
  warnings.warn(

[2024-02-11 12:17:05,547: INFO/MainProcess] mingle: searching for neighbors
[2024-02-11 12:17:06,555: INFO/MainProcess] mingle: all alone
[2024-02-11 12:17:06,570: INFO/MainProcess] celery@c92ab0fed943 ready.
[2024-02-11 12:17:06,573: INFO/MainProcess] Task version_check[ebfb439b-725a-44a9-a724-a439124e2b76] received
[2024-02-11 12:17:06,574: WARNING/ForkPoolWorker-16] [v0.4.6]: look for updates
WpnLehvOM6E: change status to priority
[2024-02-11 12:17:12,007: INFO/MainProcess] Task download_pending[1d411a99-3cfd-44b2-b7bd-88aefd03db01] received
[2024-02-11 12:17:12,008: WARNING/ForkPoolWorker-1] download_pending create callback
[2024-02-11 12:17:12,442: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE: Downloading video
[2024-02-11 12:17:19,526: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE: get metadata from youtube
[2024-02-11 12:17:22,569: WARNING/ForkPoolWorker-1] UCsXVk37bltHxD1rDPwtNM8Q: get metadata from es
[2024-02-11 12:17:22,645: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE: get ryd stats
[2024-02-11 12:17:22,978: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE: get sponsorblock timestamps
[2024-02-11 12:17:23,347: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE: sponsorblock failed: 404
[2024-02-11 12:17:23,347: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE-en: get user uploaded subtitles
[2024-02-11 12:17:23,348: WARNING/ForkPoolWorker-1] W1Qyv6EI2hE-en: get auto generated subtitles
[2024-02-11 12:17:27,020: INFO/ForkPoolWorker-16] Task version_check[ebfb439b-725a-44a9-a724-a439124e2b76] succeeded in 20.446417877014028s: None
[2024-02-11 12:17:28,475: WARNING/ForkPoolWorker-1] 1d411a99-3cfd-44b2-b7bd-88aefd03db01 Failed callback
[2024-02-11 12:17:28,478: ERROR/ForkPoolWorker-1] Task download_pending[1d411a99-3cfd-44b2-b7bd-88aefd03db01] raised unexpected: JSONDecodeError('Expecting value: line 1 column 1 (char 0)')
Traceback (most recent call last):
  File "/root/.local/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
    R = retval = fun(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/home/tasks.py", line 208, in download_pending
    videos_downloaded = downloader.run_queue(auto_only=auto_only)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/home/src/download/yt_dlp_handler.py", line 182, in run_queue
    vid_dict = index_new_video(
               ^^^^^^^^^^^^^^^^
  File "/app/home/src/index/video.py", line 411, in index_new_video
    video.check_subtitles()
  File "/app/home/src/index/video.py", line 371, in check_subtitles
    indexed = handler.download_subtitles(relevant_subtitles=subtitles)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/home/src/index/subtitle.py", line 131, in download_subtitles
    parser = SubtitleParser(response.text, lang, source)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/home/src/index/subtitle.py", line 192, in __init__
    self.subtitle_raw = json.loads(subtitle_str)
                        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[2024-02-11 12:17:28,480: WARNING/ForkPoolWorker-1] 1d411a99-3cfd-44b2-b7bd-88aefd03db01 return callback
[2024-02-11 12:17:28,583: ERROR/ForkPoolWorker-1] Unhandled Notification Exception
Traceback (most recent call last):
  File "/root/.local/lib/python3.11/site-packages/apprise/Apprise.py", line 582, in _notify_sequential
    result = server.notify(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/lib/python3.11/site-packages/apprise/plugins/NotifyBase.py", line 325, in notify
    send_calls = list(self._build_send_calls(*args, **kwargs))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/lib/python3.11/site-packages/apprise/plugins/NotifyBase.py", line 413, in _build_send_calls
    for chunk in self._apply_overflow(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/lib/python3.11/site-packages/apprise/plugins/NotifyBase.py", line 450, in _apply_overflow
    body = '' if not body else body.rstrip()
                               ^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'rstrip'

Anything else?

I love this app, thanks so much! :)

BTW yt-dlp (with no options) downloads it fine
[jayden@fedora ~]$ yt-dlp https://www.youtube.com/watch?v=W1Qyv6EI2hE
[youtube] Extracting URL: https://www.youtube.com/watch?v=W1Qyv6EI2hE
[youtube] W1Qyv6EI2hE: Downloading webpage
[youtube] W1Qyv6EI2hE: Downloading ios player API JSON
[youtube] W1Qyv6EI2hE: Downloading android player API JSON
[youtube] W1Qyv6EI2hE: Downloading player 5e928255
[youtube] W1Qyv6EI2hE: Downloading m3u8 information
[info] W1Qyv6EI2hE: Downloading 1 format(s): 303+251
[download] Destination: Can You Escape a Black Hole? #kurzgesagt #shorts [W1Qyv6EI2hE].f303.webm
[download] 100% of 5.76MiB in 00:00:00 at 8.18MiB/s
[download] Destination: Can You Escape a Black Hole? #kurzgesagt #shorts [W1Qyv6EI2hE].f251.webm
[download] 100% of 749.29KiB in 00:00:00 at 5.03MiB/s
[Merger] Merging formats into "Can You Escape a Black Hole? #kurzgesagt #shorts [W1Qyv6EI2hE].webm"
Deleting original file Can You Escape a Black Hole? #kurzgesagt #shorts [W1Qyv6EI2hE].f251.webm (pass -k to keep)
Deleting original file Can You Escape a Black Hole? #kurzgesagt #shorts [W1Qyv6EI2hE].f303.webm (pass -k to keep)

I'm having the same issue, but with different videos.

EDIT: Wait, no it's the same Kurzgesagt short causing it for me.

Looks like you're passing a dictionary into Apprise instead of a string it can use to notify from.

I can't speak for @lukyjay but I don't even use Apprise

Sorry but I have no idea what apprise is
Edit: found it - I do use that function to sync Jellyfin but I don't see how that would relate to this specific video downloading? Here are my settings for Apprise:

[image: screenshot of Apprise settings]

Looks like YT has a hiccup and is not returning valid JSON for the subtitles. Quick testing showed that even though the status code is 200, the response doesn't contain anything, just an empty string. You can reproduce that on YT directly: open the video and select auto caption, and it won't show anything.

So we can implement some error handling there.

That isn't related to our apprise integration; that just fails further down the task execution, although I appreciate the input.
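
For anyone following along, here is a rough sketch of what that kind of error handling could look like. It is not the actual fix shipped in TA; `parse_subtitle_json` is a hypothetical helper name, and the `{"events": []}` payload is only an illustrative stand-in for a subtitle body. The idea is simply to treat an empty or non-JSON body as "no subtitles" instead of letting `json.loads` raise, even when the HTTP status code was 200.

```python
import json


def parse_subtitle_json(subtitle_str):
    """Return the decoded subtitle payload, or None if the body is unusable.

    Hypothetical helper for illustration, not part of the TA codebase.
    """
    # A 200 response can still carry an empty body, as seen in this issue.
    if not subtitle_str or not subtitle_str.strip():
        return None
    try:
        return json.loads(subtitle_str)
    except json.JSONDecodeError:
        # The body was present but not valid JSON; skip these subtitles.
        return None


if __name__ == "__main__":
    print(parse_subtitle_json(""))                # empty body is skipped
    print(parse_subtitle_json('{"events": []}'))  # valid JSON is decoded
```

With a guard like this, the caller can skip indexing subtitles for that one video and carry on with the rest of the download queue.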

Thanks mate. Do I raise a bug for the apprise error? I'm not sure I understand it since I followed the docs on that bit

The apprise thing is not an error, that just isn't getting the expected data because things failed before.
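
To illustrate why the second traceback shows up at all: the Apprise code in the log calls `body.rstrip()`, so it expects the notification body to be a string, and after the failed task it received a dict instead. A minimal defensive sketch, assuming one wanted to coerce whatever the task returned into a string before notifying, could look like this. `to_notification_body` is a hypothetical name and not how TA handles it.

```python
import json


def to_notification_body(result):
    """Coerce an arbitrary task result into a string body.

    Hypothetical helper for illustration only.
    """
    if result is None:
        return ""
    if isinstance(result, str):
        return result
    try:
        # Dicts and lists become JSON text; unknown objects fall back to str().
        return json.dumps(result, default=str)
    except (TypeError, ValueError):
        return str(result)


if __name__ == "__main__":
    # The result is always a str, so calling .rstrip() on it is safe.
    print(to_notification_body({"status": "failed"}).rstrip())
```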

fix is shipped with v0.4.7