icy / google-group-crawler

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.


Only retrieved the 50 most recent posts

tjluoma opened this issue

I did this:

export _GROUP="bbedit"
./crawler.sh -sh > curl.sh
/opt/homebrew/bin/bash curl.sh

and it saved something like 4,000+ files, but the 'mbox' folder has only 50 items in it, while https://groups.google.com/g/bbedit says there are 4,444 messages in the group.

Is there something else I need to do? This is a public Google group so I thought that what I did was sufficient.

Thanks!

@tjluoma Right, what you did should be fine. Let me try on my laptop to see if I can reproduce your issue. Thanks

@tjluoma I gave it a try, and I have 759 messages in the mbox folder so far, and it's still counting.

$ pwd
/home/foo/projects/icy/google-group-crawler/bbedit/mbox

$ ls | wc -l
826
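For a count that ignores subdirectories and can be re-run while the download is still going, a small helper along these lines may be handy. This is only a sketch: `count_files` is a hypothetical name that is not part of crawler.sh, and the default path `bbedit/mbox` is assumed from the directory shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of crawler.sh): count the regular files
# directly inside a directory. The default path is an assumption taken
# from the thread; pass another directory as the first argument.
count_files() {
  local dir="${1:-bbedit/mbox}"
  # -maxdepth 1 keeps the count to the directory itself;
  # tr strips the padding that some wc implementations add.
  find "$dir" -maxdepth 1 -type f | wc -l | tr -d '[:space:]'
}

# Example: count_files bbedit/mbox
```

Running it a few times in a row shows whether the number is still growing or has stalled at 50.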

I'd suggest turning on verbose output as shown below, to see what your script is doing and whether it runs into any issues. You can also open the generated script curl.sh and modify some of the curl options if necessary. Please let me know if your next try goes better. Thanks

$ /opt/homebrew/bin/bash -x curl.sh
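Since the trace for several thousand downloads scrolls by quickly, it may help to save it to a file and search it afterwards. A minimal sketch, assuming nothing about crawler.sh itself: `run_traced`, the `TRACE_BASH` variable, and the `trace.log` file name are all hypothetical names used only for illustration. Note that `bash -x` writes its trace to stderr, so curl's own error messages land in the same log.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of crawler.sh): run a script with
# tracing enabled and keep the trace in a log file.
# TRACE_BASH is an assumed variable for choosing the interpreter,
# e.g. TRACE_BASH=/opt/homebrew/bin/bash; it defaults to bash on PATH.
run_traced() {
  local script="$1"
  local log="${2:-trace.log}"
  # stderr (the -x trace plus any error output) goes to the log;
  # stdout is left untouched.
  "${TRACE_BASH:-bash}" -x "$script" 2> "$log"
}

# Example: run_traced curl.sh trace.log
# Afterwards: grep -c '^+' trace.log   # number of traced commands
```

Searching the log for the last command that ran before things stopped is usually quicker than watching the terminal.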