soruly / trace.moe

Anime Scene Search by Image

Home Page: https://trace.moe

How to collect/download anime videos at such a large scale?

zhfkt opened this issue · comments

commented

Hi Soruly,

First, thank you for developing the trace.moe.

I would like to build an anime video repository for my project, so I am interested in any method or open source library for downloading anime videos at large scale according to an anime list (MAL? AniDB?). Could you share your experience of how you collected/downloaded such a large number of anime videos for trace.moe? Did you download the videos manually and organize the files yourself? Or did you use open source automation tools or scripts to download videos on a regular basis? I checked your project anilist-crawler, but it seems to crawl only anime metadata. I also checked your 2019 slides, but they didn't mention this either.

Thank you!

It's a bit sensitive to talk about tools that can download videos on GitHub. As you know, youtube-dl has just been taken down by GitHub. I'll answer your questions as much as I can.
You can take a look at how to download stuff using RSS. Such a feature is usually built in or achievable via plug-ins, so I don't have to program anything for that. The only thing I need to do is manually curate, every season, a list of regex keywords that map to AniList IDs. Matching entries are stored in different folders according to the AniList ID defined in the list, and the file system watcher in sola does the rest.
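
For illustration only, the curated list of regex keywords described above could be sketched in Python roughly as follows. This is not the actual trace.moe or sola code: the patterns, the AniList IDs, the `RULES` list, and the helper names `match_anilist_id` and `target_path` are all hypothetical placeholders, and the sketch only decides which folder a file would belong in.

```python
import re
from pathlib import Path

# Hypothetical curated list: (regex over the release title, AniList ID).
# Both the patterns and the IDs are placeholders, not real mappings.
RULES = [
    (re.compile(r"example\s+show\s+a", re.IGNORECASE), 100001),
    (re.compile(r"example[ ._-]show[ ._-]b", re.IGNORECASE), 100002),
]


def match_anilist_id(title):
    """Return the ID of the first rule whose pattern matches, else None."""
    for pattern, anilist_id in RULES:
        if pattern.search(title):
            return anilist_id
    return None


def target_path(library_root, title):
    """Build <library_root>/<anilist_id>/<title>, or None if no rule matches."""
    anilist_id = match_anilist_id(title)
    if anilist_id is None:
        return None
    return Path(library_root) / str(anilist_id) / title
```

With rules like these, an entry titled `[Group] Example Show A - 01 [1080p].mkv` would be placed under the `100001` folder, while an entry that matches no rule would be skipped. A watcher on the library root could then pick up new files per ID folder.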

commented

@soruly Thank you for replying.

I searched for the keyword "RSS" and found some anime RSS feed websites such as "shana project". It seems that there are several websites providing RSS feeds for each season. I think that could be a solution for collecting anime from recent seasons.

However, I found a limitation of RSS feeds: they do not provide "historic information" (https://stackoverflow.com/questions/576552/how-do-i-fetch-all-old-items-on-an-rss-feed). So it currently seems that we can only retrieve recent anime rather than all anime (some may be from 10 to 20 years ago). I'm not sure whether I understand this correctly.
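
To make the limitation concrete, here is a small Python sketch that parses an RSS 2.0 document and reports the oldest item it contains. The feed content in `SAMPLE_FEED` and the function name `feed_items` are made up for illustration. The point is that a feed is just a snapshot document, so nothing older than its oldest `<item>` can be recovered from it.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed snapshot; a real feed would be fetched from a feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Hypothetical feed</title>
<item><title>Example Show A - 03</title><pubDate>Mon, 19 Oct 2020 12:00:00 +0000</pubDate></item>
<item><title>Example Show A - 02</title><pubDate>Mon, 12 Oct 2020 12:00:00 +0000</pubDate></item>
</channel></rss>"""


def feed_items(xml_text):
    """Return (title, published datetime) for every <item> in the feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title")
        published = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((title, published))
    return items


items = feed_items(SAMPLE_FEED)
# Episodes published before this date are simply absent from the snapshot.
oldest = min(published for _, published in items)
```

In this sample, episode 01 is no longer listed, so a subscriber who starts polling now would never see it through the feed alone.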

Ya that's true. Finding old anime is very hard now.