jsdom / jsdom

A JavaScript implementation of various web standards, for use with Node.js

Repository from GitHub: https://github.com/jsdom/jsdom

JSDOM.fromURL - Valid URL causes uncaught exception

sjoerdvanderhoorn opened this issue

Basic info:

  • Node.js version: v20.11.1
  • jsdom version: 25.0.1

Minimal reproduction case

const jsdom = require("jsdom");
const { JSDOM } = jsdom;

const url = "https://www.woshalderberge.nl/"; // This page seems to redirect the request to a site starting with http:///

try {
    new URL(url); // Validate URL
    JSDOM.fromURL(url).then(dom => {
        console.log(dom.window.document.documentElement.outerHTML);
    }).catch(error => {
        console.error("Error loading the site:", error);
    });
} catch (error) {
    console.error("Invalid URL or unexpected error:", error);
}

Neither the try...catch block nor the promise's .catch() handler catches the error. Instead, the script fails with an uncaught exception:

Uncaught TypeError TypeError: Invalid URL
at URL (<node_internals>/internal/url:775:36)
at _processResponse (c:\Users\x\node_modules\jsdom\lib\jsdom\living\helpers\http-request.js:195:11)
at (c:\Users\x\node_modules\jsdom\lib\jsdom\living\helpers\http-request.js:107:12)
at onceWrapper (<node_internals>/events:633:26)
at emit (<node_internals>/events:518:28)
at parserOnIncomingClient (<node_internals>/_http_client:693:27)
at parserOnHeadersComplete (<node_internals>/_http_common:119:17)
at socketOnData (<node_internals>/_http_client:535:22)
at emit (<node_internals>/events:518:28)
at addChunk (<node_internals>/internal/streams/readable:559:12)
at readableAddChunkPushByteMode (<node_internals>/internal/streams/readable:510:3)
at Readable.push (<node_internals>/internal/streams/readable:390:5)
at onStreamRead (<node_internals>/internal/stream_base_commons:190:23)
at callbackTrampoline (<node_internals>/internal/async_hooks:130:17)
--- HTTPCLIENTREQUEST ---
at init (<node_internals>/internal/inspector_async_hook:25:19)
at emitInitNative (<node_internals>/internal/async_hooks:202:43)
at tickOnSocket (<node_internals>/_http_client:803:10)
at onSocketNT (<node_internals>/_http_client:897:5)
at processTicksAndRejections (<node_internals>/internal/process/task_queues:83:21)
--- TickObject ---
at init (<node_internals>/internal/inspector_async_hook:25:19)
at emitInitNative (<node_internals>/internal/async_hooks:202:43)
at emitInitScript (<node_internals>/internal/async_hooks:505:3)
at nextTick (<node_internals>/internal/process/task_queues:132:5)
at onSocket (<node_internals>/_http_client:863:11)
at setRequestSocket (<node_internals>/_http_agent:537:7)
at (<node_internals>/_http_agent:292:9)
at (<node_internals>/_http_agent:333:5)
at (<node_internals>/internal/util:531:12)
at createSocket (<node_internals>/_http_agent:342:5)
at addRequest (<node_internals>/_http_agent:288:10)
at ClientRequest (<node_internals>/_http_client:337:16)
at request (<node_internals>/https:378:10)
at _performRequest (c:\Users\x\node_modules\jsdom\lib\jsdom\living\helpers\http-request.js:106:28)
at Request (c:\Users\x\node_modules\jsdom\lib\jsdom\living\helpers\http-request.js:27:10)
at fetch (c:\Users\x\node_modules\jsdom\lib\jsdom\browser\resources\resource-loader.js:95:31)
at (c:\Users\x\node_modules\jsdom\lib\api.js:128:51)
at processTicksAndRejections (<node_internals>/internal/process/task_queues:95:5)
--- Promise.then ---
at fromURL (c:\Users\x\node_modules\jsdom\lib\api.js:113:30)
at (c:\Users\x\Desktop\jsdom\test.js:8:11)
at Module._compile (<node_internals>/internal/modules/cjs/loader:1376:14)
at Module._extensions..js (<node_internals>/internal/modules/cjs/loader:1435:10)
at Module.load (<node_internals>/internal/modules/cjs/loader:1207:32)
at Module._load (<node_internals>/internal/modules/cjs/loader:1023:12)
at executeUserEntryPoint (<node_internals>/internal/modules/run_main:135:12)
at (<node_internals>/internal/main/run_main_module:28:49)
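
The top frames of the trace (`_processResponse`, `onceWrapper`, `emit`, `parserOnIncomingClient`) show the `TypeError` being thrown inside an event listener that Node's HTTP parser invokes, not inside the promise chain returned by `fromURL`. The following is a minimal sketch of that general mechanism, not jsdom's actual code; the `demo` function, the `outcome` object, and the `"response"` event wiring are hypothetical stand-ins.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for a request whose "response" listener throws.
function demo() {
  const emitter = new EventEmitter();
  const outcome = { rejected: false, thrownAtEmit: null };

  const promise = new Promise(() => {
    // The listener is only registered here; it runs later, outside the executor.
    emitter.once("response", () => {
      throw new TypeError("Invalid URL");
    });
  });
  promise.catch(() => {
    outcome.rejected = true; // never runs: the promise stays pending
  });

  try {
    // emit() calls listeners synchronously, so the error surfaces at its caller.
    // In the trace above that caller is Node's HTTP parser, which does not catch it.
    emitter.emit("response");
  } catch (error) {
    outcome.thrownAtEmit = error;
  }
  return outcome;
}

console.log(demo());
```

In this sketch the error reaches whoever calls `emit()`, while the promise is never rejected, which is consistent with neither handler in the reproduction case firing.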

How does similar code behave in browsers?

N/A

Great bug report.

This is occurring because the server includes Location: http://, and http:// is not a valid relative or absolute URL. So, this should definitely error. But, as you say, it should be caught.
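
As a quick check of the claim above, the snippet below feeds `http://` to the WHATWG `URL` constructor with and without a base. `parseOrNull` is a hypothetical helper introduced only for this illustration, and the base is the URL from the reproduction case, used here purely as a string (no request is made).

```javascript
// Hypothetical helper: returns a URL object, or null if parsing throws.
function parseOrNull(input, base) {
  try {
    return new URL(input, base);
  } catch {
    return null;
  }
}

const base = "https://www.woshalderberge.nl/";

// "http://" has a special scheme but an empty host, so parsing fails.
console.log(parseOrNull("http://"));
// Supplying a base does not help, because the input carries its own scheme.
console.log(parseOrNull("http://", base));
// An ordinary relative reference does resolve against the base.
console.log(parseOrNull("/other", base)?.href);
```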

The problematic code is

const nextURL = redirectAddress.startsWith("https:") ?
  new URL(redirectAddress) :
  new URL(redirectAddress, this.currentURL);

and it needs some try/catch blocks, I guess.
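
One possible shape for that try/catch, offered only as a sketch and not as the fix jsdom adopted: `resolveRedirect` and its `{ nextURL, error }` return value are hypothetical, and `currentURL` stands in for `this.currentURL` from the snippet above.

```javascript
// Hypothetical helper wrapping the redirect resolution quoted above.
function resolveRedirect(redirectAddress, currentURL) {
  try {
    const nextURL = redirectAddress.startsWith("https:") ?
      new URL(redirectAddress) :
      new URL(redirectAddress, currentURL);
    return { nextURL, error: null };
  } catch (error) {
    // Hand the parse failure back to the caller instead of letting it escape.
    return { nextURL: null, error };
  }
}

const bad = resolveRedirect("http://", "https://www.woshalderberge.nl/");
console.log(bad.nextURL, bad.error && bad.error.message);
```

The caller would then need to route `error` through whatever path rejects the promise returned by `fromURL`; how that path looks inside jsdom is not shown in this thread, so it is left out here.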