net::ERR_ABORTED during headless testing
maximkoshelenko opened this issue · comments
Hello team. I need help with an issue that only reproduces in headless mode. I have a test that checks for a 200 or 206 status after navigating to each footer link of a site. After a PDF file was added as a footer link, the test started crashing in headless mode (in debug mode it works). Please help solve this problem :).
Steps to reproduce
Tell us about your environment:
- Puppeteer version:
- Platform / OS version: Windows
- URLs (if applicable): http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf
- Node.js version:
npm -v : 5.6.0
What steps will reproduce the problem?
Please include code that reproduces the issue.
await page.goto('http://www.cancernetwork.com/', { waitUntil: "domcontentloaded" });
await page.waitForSelector('.expanded .menu');
let footerLinks = await page.evaluate(
  () => Array.from(document.body.querySelectorAll('.expanded a[href]'), ({ href }) => href)
);
for (var r = 0; r < footerLinks.length; r++) {
  let [response] = await Promise.all([
    page.waitForNavigation(),
    await page.goto(footerLinks[r], { waitUntil: "load" }),
  ]);
  if (response._status == 206) {
    expect(response._status).toBe(206);
  } else {
    expect(response._status).toBe(200);
  }
  response = '';
  console.log('Footer link: ' + footerLinks[r] + ' was checked successfully');
}
What is the expected result?
Navigating to http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf succeeds with a 200 or 206 status.
What happens instead?
With headless mode disabled the code works, but in headless mode there is an error:
net::ERR_ABORTED at http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf
at navigate (node_modules/puppeteer/lib/Page.js:592:37)
I have also tried running the code on the https://try-puppeteer.appspot.com/ site.
Code:
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf');
await browser.close();
Result:
Error running your code. Error: net::ERR_ABORTED at http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf
Hi @maximkoshelenko,
Unlike headful Chrome, headless Chrome doesn't know how to navigate to PDFs. I think it'll issue a download instead, but downloads are not supported yet: #299
Thanks. Hope downloads will be supported ASAP :)
@aslushnikov is there any way to know that the net::ERR_ABORTED is issued because of a download when we catch the error?
We could use request interception and look at the content type, but ideally I want to be able to handle it in the catch.
Why is this closed? My issue occurs when redirecting...
How can this error caused by a PDF file be distinguished from other errors? I also hit this problem and could try-catch it, but I am afraid I will overlook other errors unrelated to PDFs.
@gsouf Can you expand on how to implement that? The content type isn't known until there is a response, and the resourceType() for pdf pages is simply "document"
@deansg I'm doing something like this. I cannot share more because the rest is part of a more complex thing but this is the gist of it:
const page = openSomePuppeteerPage();
async function pageOnResponseRequest(response) {
  if (response.frame() === page.mainFrame() && response.request().isNavigationRequest()) {
    const statusCode = response.status();
    const headers = response.headers();
    // At this point you have access to status code and headers which you can use to detect that it's an html document, an image, a downloadable document, etc...
  }
}
page.on('response', pageOnResponseRequest);
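To connect this with the earlier question about handling the error in the catch, here is a minimal sketch that remembers the headers of the main-frame navigation response and consults them only when `page.goto` rejects with `net::ERR_ABORTED`. The names `looksLikeDownload` and `gotoOrDetectDownload` are hypothetical helpers, not part of Puppeteer, and the list of "displayable" content types is an assumption rather than a tested fact. It also assumes the `response` event fires before `goto` rejects, which is what the gist above relies on.

```javascript
// Heuristic (assumption): headers that look like a download rather than a page.
function looksLikeDownload(headers) {
  const type = (headers['content-type'] || '').toLowerCase();
  const disposition = (headers['content-disposition'] || '').toLowerCase();
  if (disposition.startsWith('attachment')) return true;
  // Assumed list of types headless Chrome renders instead of downloading.
  const displayable = ['text/html', 'application/xhtml+xml', 'text/plain', 'image/'];
  return type !== '' && !displayable.some((prefix) => type.startsWith(prefix));
}

// Navigate, and classify a net::ERR_ABORTED rejection as a download
// only when the recorded navigation headers support that reading.
async function gotoOrDetectDownload(page, url) {
  let navHeaders = null;
  const onResponse = (response) => {
    if (response.frame() === page.mainFrame() && response.request().isNavigationRequest()) {
      navHeaders = response.headers(); // Puppeteer lower-cases header names
    }
  };
  page.on('response', onResponse);
  try {
    const response = await page.goto(url);
    return { kind: 'page', response };
  } catch (err) {
    const aborted = String(err.message).includes('net::ERR_ABORTED');
    if (aborted && navHeaders && looksLikeDownload(navHeaders)) {
      return { kind: 'download', headers: navHeaders };
    }
    throw err; // anything else is still a real error
  } finally {
    page.off('response', onResponse);
  }
}
```

Because every other rejection is rethrown, errors unrelated to downloads are not swallowed by the catch.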
@aslushnikov Do you know which content types experience this behaviour? So that I can use gsouf's idea to filter them out
@deansg everything that is a download. I would say everything that is not displayable text (html, maybe xml and json?) or an image. Maybe video and audio too? I haven't tested all of those.
@gsouf My problem is that I have come across websites that return a javascript content type and then continue loading until proper HTML is loaded. If I intercept the first response and immediately decide that the website isn't relevant because its content type isn't HTML-based, then I lose valuable information (I need to extract content only from websites that return html). I'm not sure whether it's better to build a blacklist of content types or a whitelist.
@deansg the solution I proposed filters only navigation requests for the main frame. What you want to do is to process the request only if content type (from response headers) is html
@deansg I have never seen a website returning a javascript content type and running properly. The browser won't process the javascript. It will just display it on screen as plain text.
@gsouf when I try to navigate to the following website:
https://atelierhaussmann.de/en/
The 'response' event is called several times, and in several of the cases the content type is application/javascript
@deansg you'll have to figure out what's wrong because it does not occur for me. Even with your website, I confirm it has content-type text/html
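One likely explanation for the difference is that the `response` event fires for every resource the page loads, scripts included, so an unfiltered listener will see `application/javascript` many times. A minimal sketch of the whitelist approach discussed above follows; `mediaType` and `isMainHtmlResponse` are hypothetical helper names, and the choice of accepted types is an assumption.

```javascript
// Extract the bare media type, e.g. "text/html; charset=utf-8" -> "text/html".
function mediaType(contentType) {
  return (contentType || '').split(';')[0].trim().toLowerCase();
}

// True only for the main-frame navigation response when it is HTML.
// Sub-resource responses (scripts, styles, images) are ignored entirely.
function isMainHtmlResponse(page, response) {
  if (response.frame() !== page.mainFrame()) return false;
  if (!response.request().isNavigationRequest()) return false;
  const type = mediaType(response.headers()['content-type']);
  return type === 'text/html' || type === 'application/xhtml+xml';
}
```

It could be wired up as `page.on('response', (r) => { if (isMainHtmlResponse(page, r)) { /* process */ } });`, so that script responses never influence the decision about the site.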
I get the same error when I try to get the redirect chain of a URL. Is there any way to exit the process after fetching the data?
Error: net::ERR_ABORTED at https://www.example.com/vip-dl/?filename=23309907.rar
at navigate (C:\Users\noora\AppData\Roaming\npm\node_modules\puppeteer\lib\FrameManager.js:120:37)
I am also facing a similar error. Any help is appreciated.
I have been going over the same issue for the past few days. It was working fine before, and nothing has changed...
I need to test the download speed for a video download. I face the same issue.
Is this issue related to a bad connection? I am facing the same issue: sometimes this error appears and sometimes it doesn't.
In my case, the reason for this was that I was trying to run several jobs in parallel with promises, and I hadn't noticed that they were all using the same page object.
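A minimal sketch of the fix this comment implies: give every parallel job its own page, so that one job's `goto` cannot interrupt another job's navigation on a shared page. `runWithOwnPages` is a hypothetical helper name; it only relies on `browser.newPage()` and `page.close()`.

```javascript
// Run `job(page, item)` for every item in parallel, one fresh page per job.
async function runWithOwnPages(browser, items, job) {
  return Promise.all(items.map(async (item) => {
    const page = await browser.newPage(); // never shared between jobs
    try {
      return await job(page, item);
    } finally {
      await page.close(); // release the page even if the job throws
    }
  }));
}
```

For example, `runWithOwnPages(browser, urls, async (page, url) => { await page.goto(url); return page.title(); })` would load each URL in a separate page and resolve with the titles in input order.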
Another reason you may be experiencing this error consistently is that your system is telling Chrome to use jemalloc. Turn off system-wide jemalloc to get rid of this error.
See: #8246 (comment)
I use Windows 10. When I call setCookie first, I get the same error.
When I remove the cookie code, it runs OK.
I captured the packets with Wireshark when the request was aborted:
The IP address starting with 10.254 is my Windows 10 PC.