API URL matching pattern is too rigid
FabulousCupcake opened this issue
Unsure if I should file this here or kanasimi/CeJS; the issue seems to be this regexp pattern:
https://github.com/kanasimi/CeJS/blob/521b966ccf2810455c9d89ec893f478f06d4575a/application/net/wiki/namespace.js#L399
Steps to reproduce
Try getting a page from a wiki whose host has no subdomain or no domain extension, is a bare IP address, or has a port number attached:
const Wikiapi = require("wikiapi");
(async () => {
    const url = 'https://gbf.wiki/api.php';
    // These will also fail
    // const url = 'http://localhost/api.php';
    // const url = 'http://127.0.0.1/api.php';
    // const url = 'https://www.example.com:8080/api.php';
    const wiki = new Wikiapi(url);
    const data = await wiki.page('Main_Page');
    console.log(data.wikitext);
})();
Expected
It just works
Actual
Get the following error:
api_URL: Unknown project: [https://gbf.wiki/api.php]! Using default API URL.
It then queries the default API (English Wikipedia) instead.
Workaround
I can still get it working by running a local reverse proxy with nginx and modifying /etc/hosts, but this is very non-ideal.
If both https:// and /api.php are present, continue.
I'm not sure what you're trying to convey; as the example shows, it does not work even when both https:// and /api.php are present.
To me, the pattern /^(https?:)?(?:\/\/)?(([a-z][a-z\d\-]{0,14})\.([a-z]+)+(?:\.[a-z]+)+)/i feels unnecessary and overly rigid. If you need to grab the parts of the URL, it's probably better to let the URL API do the job. To check whether it's a valid MediaWiki API endpoint, it's probably best to just assume the URL is correct, send some queries, and see if they fail.
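To make the difference concrete, here is a minimal sketch that runs the pattern quoted above next to the standard URL class on the URLs from the report. The pattern is copied from this thread and has not been re-checked against the CeJS source. describeEndpoint and siteinfoProbeUrl are hypothetical helper names, not part of wikiapi or CeJS. The probe assumes the endpoint answers the usual MediaWiki action=query&meta=siteinfo request.

```javascript
// Pattern as quoted in this thread (assumption: identical to the one in CeJS).
const PATTERN = /^(https?:)?(?:\/\/)?(([a-z][a-z\d\-]{0,14})\.([a-z]+)+(?:\.[a-z]+)+)/i;

// Hypothetical helper: let the URL class split the endpoint into its parts.
function describeEndpoint(url) {
  const { protocol, hostname, port, pathname } = new URL(url);
  return { protocol, hostname, port, pathname };
}

// Hypothetical helper: build a cheap query that a MediaWiki endpoint should
// answer. A real client would fetch this and treat an error or non-JSON
// response as "not a usable API endpoint".
function siteinfoProbeUrl(url) {
  const probe = new URL(url);
  probe.searchParams.set('action', 'query');
  probe.searchParams.set('meta', 'siteinfo');
  probe.searchParams.set('format', 'json');
  return probe.href;
}

const urls = [
  'https://gbf.wiki/api.php',
  'http://localhost/api.php',
  'http://127.0.0.1/api.php',
  'https://www.example.com:8080/api.php',
];

for (const url of urls) {
  // Print whether the pattern matches, then what the URL class extracts.
  console.log(url, PATTERN.test(url), describeEndpoint(url));
}
console.log(siteinfoProbeUrl(urls[0]));
```

The pattern rejects the first three URLs outright because it requires at least two dots in a host that starts with a letter. It does match the host part of the last URL but has no group for the port, so what happens to :8080 depends on code not shown here. The URL class handles all four without special cases.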
I suggest the pattern should just check if http*:// and /api.php are present. But Kanasimi will find something smarter ; )
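The looser check suggested here could look roughly like the sketch below. looksLikeApiUrl is a hypothetical name, and this is only one way to express "http(s) scheme plus /api.php", not what wikiapi or CeJS actually ended up doing.

```javascript
// Hypothetical helper: accept any http(s) URL whose path ends in /api.php.
function looksLikeApiUrl(input) {
  let url;
  try {
    // new URL() throws a TypeError on strings that are not absolute URLs.
    url = new URL(input);
  } catch {
    return false;
  }
  const isHttp = url.protocol === 'http:' || url.protocol === 'https:';
  return isHttp && url.pathname.endsWith('/api.php');
}

console.log(looksLikeApiUrl('https://gbf.wiki/api.php'));
console.log(looksLikeApiUrl('http://127.0.0.1/api.php'));
console.log(looksLikeApiUrl('https://www.example.com:8080/api.php'));
console.log(looksLikeApiUrl('not a url'));
```

Checking the parsed protocol and pathname rather than the raw string means ports, IP hosts, and single-label hosts such as localhost need no special handling.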
Late reply, but I can confirm that it works now. Thank you!