Stop iterating on the content that is 404'd or DMCA'd

Question

fl0werpowers opened this issue 2 years ago · comments

Some content referenced in the archives no longer exists, either because the original uploader deleted it or because it was taken down by a DMCA claim. The tool already reports both cases clearly (as 'Download failed with status "404 Not Found"' and 'Download failed with status "403 Forbidden"' respectively), and the 403 response explicitly states that the content was struck by DMCA. Retrying such downloads multiple times is a waste of time, so this media could simply be skipped.

vibe · Answer 1 · Mon Nov 21 2022 23:48:32 GMT+0800 (China Standard Time)

These are the exceptions in question:

FAIL. Media couldn't be retrieved from https://pbs.twimg.com/media/EbH_bxcUYAgxbki.png:orig because of exception: Download failed with status "404 Not Found". Response content: ""

FAIL. Media couldn't be retrieved from https://video.twimg.com/ext_tw_video/1560406436982804480/pu/vid/1280x720/m7-vUTLunERc4auB.mp4?tag=12 because of exception: Download failed with status "403 Forbidden". Response content: "{"error_code":2,"error_response":"Dmcaed"}"
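For what it's worth, the two failures above could probably be told apart from transient errors with a small check like the one below. This is only a rough sketch, not the tool's actual code: `is_permanent_failure`, `load_skipped`, `save_skipped` and the skip-list file are hypothetical names made up for illustration, and it assumes the DMCA response body always looks like the JSON in the log above.

```python
import json
import os


def is_permanent_failure(status_code: int, response_body: str) -> bool:
    """Hypothetical helper: True if retrying the download is pointless."""
    # 404: the media was deleted, so a retry will not bring it back.
    if status_code == 404:
        return True
    # 403: only treat it as permanent when the body says it was DMCA'd
    # (assumed shape: {"error_code": 2, "error_response": "Dmcaed"}).
    if status_code == 403:
        try:
            payload = json.loads(response_body)
        except json.JSONDecodeError:
            return False
        return isinstance(payload, dict) and payload.get("error_response") == "Dmcaed"
    # Anything else (timeouts, 5xx, ...) may succeed on a later attempt.
    return False


def load_skipped(path: str) -> set:
    """Load the set of URLs already known to be gone (hypothetical file)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))


def save_skipped(path: str, urls: set) -> None:
    """Persist the skip list so later runs do not retry these URLs."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(urls), f, indent=2)
```

The download loop would then check the URL against the loaded set before requesting it, and add the URL to the set whenever `is_permanent_failure` returns true. A plain 403 without the DMCA marker is deliberately left retryable, since it might be a temporary block.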

Doug Pearson · Answer 2 · Thu Dec 29 2022 14:28:00 GMT+0800 (China Standard Time)

Agreed. I was going to raise this issue myself. Thanks for the well-written issue.