icy / google-group-crawler

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.

HTTP Error 413

want-to-export-group opened this issue · comments

I am trying to archive messages from a large private group. The script seems to run fine until the "Fetching data" step. Here is the output (the group name has been changed to "group"):

:: Downloading all topics (thread) pages...
:: Creating './group//threads/t.0' with 'categories/group'
:: Fetching data from 'https://groups.google.com/forum/?_escaped_fragment_=categories/group'...
--2019-12-20 13:16:16-- https://groups.google.com/forum/?_escaped_fragment_=categories/group
Resolving groups.google.com (groups.google.com)... 2607:f8b0:400d:c0f::8a, 172.217.197.102, 172.217.197.113, ...
Connecting to groups.google.com (groups.google.com)|2607:f8b0:400d:c0f::8a|:443... connected.
HTTP request sent, awaiting response... 413 Request Entity Too Large
2019-12-20 13:16:16 ERROR 413: Request Entity Too Large.

As you can see, there is an Error 413. What is causing this, and how can it be fixed?
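
One possible trigger for a 413 on a plain GET is an oversized Cookie header (private groups need browser-exported cookies, and that file can grow large). A way to check is to replay the request by hand, once without and once with the cookie file; the group name and cookie path below are placeholders, not taken from the script:

#!/usr/bin/env bash
# Replay the failing request once without and once with the cookie file,
# printing only the HTTP status lines. Adjust _GROUP and _COOKIES as needed.
_GROUP="group"
_COOKIES="cookies.txt"
_URL="https://groups.google.com/forum/?_escaped_fragment_=categories/${_GROUP}"

echo ">> without cookies:"
wget -O /dev/null --server-response "${_URL}" 2>&1 | grep "HTTP/"

echo ">> with cookies:"
wget -O /dev/null --server-response --load-cookies "${_COOKIES}" "${_URL}" 2>&1 | grep "HTTP/"

If only the second request returns 413, the cookie file (rather than the group size) is the likely cause.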

The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?

@want-to-export-group Were you able to resolve the issue?

I haven't seen that issue. Maybe it's a temporary network issue; you can look at the wget command and retry to see if that helps.
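
For example, a simple manual retry loop around that call (the URL, attempt count, and delay here are just for illustration, not from the script):

_URL="https://groups.google.com/forum/?_escaped_fragment_=categories/group"
# Retry the fetch a few times with a pause between attempts.
for _i in 1 2 3; do
  wget --output-document=t.0 "${_URL}" && break
  echo ":: attempt ${_i} failed, retrying in 10s..." >&2
  sleep 10
done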

The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?

Yes, I can confirm this issue. Google has changed something that prevents our script from working :(

:( It used to work. Now accessing it from a web browser also generates an error: https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2

By mistake, google-group-crawler-public2 was set to private mode. Now it's fine. Btw, I have rewritten the script using curl; hopefully it can help to resolve a few strange issues. Stay tuned.
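
For reference, a minimal sketch of what the same topic-page fetch looks like with curl instead of wget; this is only an illustration under assumed paths, not the actual code of the rewrite:

_GROUP="group"
_URL="https://groups.google.com/forum/?_escaped_fragment_=categories/${_GROUP}"
# --cookie reads a Netscape-format cookie file when the argument is a file name without '='.
curl --fail --silent --show-error \
  --cookie cookies.txt \
  --output t.0 \
  "${_URL}"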

The problem should be fixed in the latest version, 2.0.0 (using curl). Please have a look and see if it's better. Thanks.