Exceen / 4chan-downloader

Python3 script to continuously download all images/webms of multiple 4chan threads simultaneously - without installation

arguments -t or -c not recognized

unalignedcoder opened this issue

There seems to be a contradiction between the instructions and reality

[screenshot]

I really wanted to use that -t argument but it won't let me.

Did you download the zip from the release page? Because that version is almost 5 years old by now.

Okay, where from then, kind sir?

Code -> Download ZIP

It works as intended now, but why does it create two folders from the same thread, one in /downloads and the other in /new?

It works as intended now, but why does it create two folders from the same thread, one in /downloads and the other in /new?

This is exactly what the author intended: #34 (comment)


This is unrelated, but I have noticed a bug that causes files to be downloaded both to downloads/ and new/, creating duplicates. I think (?) it was introduced by one of my changes because it didn't happen before I ever contributed to the project. This is a subject for a whole new ticket.

commented

why does it create two folders from the same thread, one in /downloads and the other in /new?

If you delete a file in the /downloads folder, the script handles this situation as if the file was never downloaded at all. That means that if the file doesn't exist in /new folder either, it will be downloaded into both folders again. If you only delete the file in the /new folder, nothing happens at all.

The reason for this behavior is that I wanted a full archive of a thread (which is inside the /downloads folder), and in addition I wanted a place to keep only the files I actually want and remove the rest (which is essentially the /new folder).
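The behavior described above can be sketched roughly as follows. This is only an illustration, not the script's actual code: the function name `plan_targets` and its parameters are hypothetical, and the case where a file is missing from /downloads but still present in /new is not spelled out above, so the sketch assumes it is restored to /downloads only.

```python
from pathlib import Path
from typing import List


def plan_targets(filename: str, downloads_dir: Path, new_dir: Path) -> List[Path]:
    """Return the folders a file would be written to (hypothetical helper)."""
    if (downloads_dir / filename).exists():
        # Already archived: nothing happens, even if the copy in new/ was deleted.
        return []

    # Missing from downloads/: treated as if it had never been downloaded.
    targets = [downloads_dir]
    if not (new_dir / filename).exists():
        # Missing from new/ as well: it ends up in both folders again.
        targets.append(new_dir)
    # Assumption: if new/ still has the file, only downloads/ is refilled.
    return targets
```

With two empty folders, `plan_targets("a.jpg", downloads_dir, new_dir)` returns both folders, which matches the duplication reported in this issue for a freshly downloaded thread.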

The /downloads folder basically just memorizes which files have already been downloaded, so that the script can determine which files in the thread are new and still need to be downloaded. In theory you could just save a list of already downloaded files in a JSON file instead of storing the full files in the /downloads folder.
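A minimal sketch of that JSON idea, assuming a simple manifest that holds a list of file names. The script does not do this as far as this thread shows; the helper names `load_seen`, `save_seen` and `filter_new` and the manifest layout are made up for illustration.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


def load_seen(manifest: Path) -> Set[str]:
    """Read the names of already downloaded files; empty set if no manifest yet."""
    if not manifest.exists():
        return set()
    return set(json.loads(manifest.read_text(encoding="utf-8")))


def save_seen(manifest: Path, seen: Set[str]) -> None:
    """Write the names back as a sorted JSON list."""
    manifest.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")


def filter_new(thread_files: Iterable[str], seen: Set[str]) -> List[str]:
    """Keep only the files that are not recorded in the manifest."""
    return [name for name in thread_files if name not in seen]
```

A download loop would call `load_seen` once, fetch whatever `filter_new` returns, add those names to the set, and call `save_seen`. The trade-off is that the manifest only records names, so the full archive in /downloads would no longer exist.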

I didn't delete any files, yet this thread had all its content duplicated from the start:

[two screenshots]

That's a 200 MB folder twice. Is this intentional? If the /downloads folder is used only for reference, perhaps it could store the thumbnails instead of the full files?

As a quick workaround, I've added the -b --backup argument:

[two screenshots]

without the -b, it only saves images in /downloads

commented

That's a 200 MB folder twice. Is this intentional? If the /downloads folder is used only for reference, perhaps it could store the thumbnails instead of the full files?

This is intended, yes.

As a quick workaround, I've added the -b --backup argument:

without the -b, it only saves images in /downloads

I might push something later to omit the /new folder.

commented

Added the --no-new-dir argument.
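For readers wondering how such a switch might be wired up, here is a small `argparse` sketch. Only the flag name `--no-new-dir` comes from this thread; the helper names, the help text and the folder names as plain strings are assumptions, and the real script may handle the option differently.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Illustrative subset only, not the script's real argument parser.
    parser = argparse.ArgumentParser(description="sketch of a --no-new-dir switch")
    parser.add_argument(
        "--no-new-dir",
        action="store_true",
        help="assumed meaning: do not create or fill the new/ folder",
    )
    return parser


def target_dirs(args: argparse.Namespace) -> List[str]:
    """Return the folders to write to, depending on the flag (hypothetical helper)."""
    dirs = ["downloads"]
    if not args.no_new_dir:
        dirs.append("new")
    return dirs
```

Parsing an empty argument list yields both folders, while passing `--no-new-dir` leaves only `downloads`.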