gpodder / gpodder

The gPodder podcast client.

Home Page: http://gpodder.org/


YouTube RSS feeds broken?

pieska opened this issue · comments

gPodder got a 404 from YouTube when updating feeds:

1706553478.697387 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.youtube.com:443
1706553478.750239 [gpodder.gtkui.main] INFO: Aktualisiert DEFCONConference (1/1)
1706553478.925225 [urllib3.connectionpool] DEBUG: https://www.youtube.com:443 "GET /feeds/videos.xml?channel_id=UC6Om9kAkl32dWlDSNlDS9Iw HTTP/1.1" 404 1613
1706553478.925797 [gpodder.gtkui.main] ERROR: Error updating feed: DEFCONConference: not found
1706553478.932774 [gpodder.gtkui.main] DEBUG: Updated channel is active, updating UI
1706553478.960462 [gpodder.my] DEBUG: Processing received episode actions...
1706553478.974364 [gpodder.my] DEBUG: Received episode actions processed.
1706553478.974485 [gpodder.dbsqlite] DEBUG: Commit.
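For reference, the URL in the log above follows YouTube's channel-feed scheme, which is just the channel ID in a query string. A minimal helper to build such a URL (the function name is mine, not gPodder's):

```python
def channel_feed_url(channel_id):
    """Build the RSS/Atom feed URL for a YouTube channel ID."""
    return f"https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"

# The failing feed from the log above:
print(channel_feed_url("UC6Om9kAkl32dWlDSNlDS9Iw"))
```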

The issue started today.

The feeds have gone down for several days at a time in the past. However, I am also unable to render the YouTube website in Firefox with uBlock Origin enabled, which might mean they have made a big change to block ad blockers and removed the feeds in the process. Or maybe the change simply caused an outage.

I need to fix get_channel_id_url() so it can extract the channel ID directly from the URL when the URL already contains it. That would avoid the lookup that fails when the feed returns a 404. I also fixed an issue a while back where gPodder would erase cover art and descriptions when updating during outages, but that fix appears not to be working, because my descriptions have been erased.
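The extraction part could look something like this; a minimal sketch of the idea, not gPodder's actual implementation, using only the standard library:

```python
from urllib.parse import urlparse, parse_qs

def extract_channel_id(url):
    """Return the channel ID embedded in a YouTube feed URL, or None.

    Feed URLs carry the ID as a query parameter
    (.../feeds/videos.xml?channel_id=UC...), so no network request
    is needed when it is already present.
    """
    query = parse_qs(urlparse(url).query)
    ids = query.get("channel_id")
    return ids[0] if ids else None
```

When the parameter is present, the caller can skip the remote lookup entirely; only URLs without it (e.g. @handle or /user/ pages) would still need a page fetch to resolve the ID.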

yt-dlp is still able to build a list of episodes and download videos: turn on the yt-dlp extension and let it manage YouTube channels. You will still see get_channel_id_url() errors, but it works. Keep in mind that yt-dlp fetches every single video page on every update, so it is very slow.

It is possible to query an episode list significantly faster than yt-dlp does, and if the feeds go away for good, I might try to do it. I wrote an unreleased YouTube browser that reads chunks of 30 videos at a time in a fraction of a second (yt-dlp takes a fixed 5 seconds plus 1 second per video). The initial fetch of a channel could use yt-dlp to get the full list, but updates could fetch one or more chunks to find new episodes and then only perform a page fetch for each new video, instead of for all existing videos. Metadata changes to old videos wouldn't be seen with this approach, and videos deleted from the channel also wouldn't be deleted from gPodder.
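The update strategy described above amounts to walking the channel's video list, newest first, and stopping at the first already-known ID. A sketch under stated assumptions (fetch_chunk and the video IDs are hypothetical stand-ins for the real chunked listing):

```python
def find_new_episodes(fetch_chunk, known_ids, chunk_size=30):
    """Walk a channel's video list chunk by chunk, newest first,
    collecting IDs until a previously-seen one appears.

    fetch_chunk(offset, size) returns a list of video IDs,
    newest first; an empty list means the channel is exhausted.
    """
    new_ids = []
    offset = 0
    while True:
        chunk = fetch_chunk(offset, chunk_size)
        if not chunk:
            return new_ids  # reached the end of the channel
        for vid in chunk:
            if vid in known_ids:
                return new_ids  # everything older is already known
            new_ids.append(vid)
        offset += chunk_size

# Hypothetical channel listing, newest first (IDs are made up):
CHANNEL = ["v9", "v8", "v7", "v6", "v5", "v4", "v3", "v2", "v1"]

def fetch_chunk(offset, size):
    return CHANNEL[offset:offset + size]

# Only the two videos newer than the last known one get fetched:
print(find_new_episodes(fetch_chunk, known_ids={"v7", "v6"}, chunk_size=3))
```

Per-video page fetches would then run only for the returned IDs, which is where the speedup over yt-dlp's fetch-everything behavior comes from.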

https://news.ycombinator.com/item?id=39175495

They might have begun detecting repeated requests across RSS feeds, and a bug in that mechanism caused the outage. If that is the case and the bug gets fixed, it would mean the end of the feeds even though they technically still exist, unless the threshold is high enough to allow a single user to fetch 100+ unique feeds a day.

Maybe some anti-abuse mechanism: https://issuetracker.google.com/issues/322736318?pli=1

Someone on the internet wrote a scraper proxy: https://gist.github.com/yunruse/93dd40719568dccccebcca26d9ed1ccb. Bookmarked; it might come in handy.