Tunnelling doesn't work in v3.1.0
pkey opened this issue · comments
Description
With the new version of needle (3.1.0), the current approach (here) to set up tunnelling doesn't work anymore (version 3.0.0 works fine). Scanning the network using Wireshark shows that needle ends up looping CONNECT requests to the proxy. I suspect that needle tries to use the tunnel agent while also passing proxy parameters from the environment (HTTP_PROXY and HTTPS_PROXY), and thus ends up in this weird state.
We were previously using global agent together with _PROXY environment variables to force needle to do CONNECT requests, but since the new version that doesn't work either (which might be a different issue).
How to reproduce
- set up a local mitmproxy (or any other proxy)
- set up tunnelling using the tunnel agent as described in the documentation, pointing proxy and port to the mitmproxy
- set the HTTP_PROXY and HTTPS_PROXY environment variables to point to the same proxy
- try to make a network call using needle
Expected behaviour
When HTTP_PROXY and HTTPS_PROXY are set and tunnel is configured, needle should make a CONNECT request to the proxy and establish a tunnel.
Actual behaviour
needle tries to establish a CONNECT request but doesn't succeed.
Hi and thanks for the detailed bug report. Would it be possible to see a small code snippet so I can reproduce the error quickly?
Here's the code snippet:
var needle = require('needle');
var tunnel = require('tunnel');
var myAgent = tunnel.httpOverHttp({
  proxy: { host: '127.0.0.1', port: 8080 }
});
needle.get('https://github.com/status', { agent: myAgent }, function (error, response) {
  if (!error && response.statusCode == 200)
    console.log(response.body);
  else console.log(error);
});
Make sure to npm install needle tunnel and then set the _PROXY environment variables to point to the same proxy (in my case 127.0.0.1:8080) as the agent configuration above.
When I run this with needle version 3.1.0 installed, in Wireshark I see attempts to CONNECT 127.0.0.1:8080 HTTP/1.1 and HTTP/1.1 502 Bad Gateway (text/html), whereas with version 3.0.0 I can see CONNECT github.com:443 HTTP/1.1 and HTTP/1.1 200 Connection established, which is what I would expect. Mind that both result in Server disconnected, but the symptoms are the same ones we are experiencing in our own system, so I think this is a good example.
Let me know how it goes. I will also try to debug, though I am not very familiar with the codebase of needle.
Let me add a bit more context and details.
What we are trying to achieve is to use needle with an HTTP/S proxy. In a secure setup, HTTP clients are expected to send proxied requests for HTTPS resources through an HTTP CONNECT tunnel.
As far as I understand, needle doesn't support CONNECT requests. Therefore we are using https://github.com/gajus/global-agent to patch node's http agent to provide CONNECT-capable proxy support. There are similar libraries like https://github.com/koichik/node-tunnel (deprecated) or https://github.com/TooTallNate/node-http-proxy-agent.
The change introduced in #382 picks up HTTP_PROXY/HTTPS_PROXY from environment variables and does not allow needle to be used without a proxy if those environment variables are present.
We're looking for a way to opt-out of needle picking up proxy configuration from environment variables.
Yes, sorry. I'll make some time this week to take a look into this. :)
During the analysis and debugging, I noticed that no consistent distinction is made between http_proxy and https_proxy. It is sufficient for one of the two to be set; that one is then used for all connections. If both are set, http_proxy is used.
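For comparison, here is a minimal sketch of selecting the proxy per target scheme. `pickProxyFromEnv` is a hypothetical helper and not part of needle; it assumes that HTTPS targets should only use HTTPS_PROXY and HTTP targets only HTTP_PROXY, with no fallback between the two.

```javascript
// Hypothetical helper, not needle's implementation.
// Picks the proxy URL based on the scheme of the target URL,
// accepting both upper- and lower-case variable names.
function pickProxyFromEnv(targetUrl, env = process.env) {
  const { protocol } = new URL(targetUrl);
  if (protocol === 'https:') {
    return env.HTTPS_PROXY || env.https_proxy || null;
  }
  if (protocol === 'http:') {
    return env.HTTP_PROXY || env.http_proxy || null;
  }
  return null; // other schemes are not proxied in this sketch
}

// Made-up example values
const sampleEnv = {
  HTTP_PROXY: 'http://localhost:3128',
  HTTPS_PROXY: 'http://localhost:8888'
};
console.log(pickProxyFromEnv('https://github.com/status', sampleEnv)); // http://localhost:8888
console.log(pickProxyFromEnv('http://example.com/', sampleEnv));       // http://localhost:3128
```

With only one of the two variables set, this sketch returns null for the other scheme instead of reusing the same proxy for all connections.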
My test snippet:
var needle = require('needle');
needle.get('https://github.com/status', function (error, response) {
  if (!error && response.statusCode == 200)
    console.log(response.body);
  else console.log(error);
});
The results (via proxy: export HTTPS_PROXY=http://localhost:8888):
| | squid | tinyproxy |
|---|---|---|
| HTTPS websites | ❌ | |
| HTTP sites | ✅ | ✅ |
For HTTPS pages, the connection goes through the tinyproxy, but the proxy tries to connect to the destination via HTTP. An attempt with curl through the tinyproxy works without problems.
Some return values:
error from squid
CacheErrorInfo - ERR_READ_ERROR&body=CacheHost: d4e570ebcbe2
ErrPage: ERR_READ_ERROR
Err: [none]
TimeStamp: Fri, 29 Dec 2023
ClientIP: 10.10.x.x
ServerIP: github.com
HTTP Request:
GET /status HTTP/1.1
Accept: */*
User-Agent: Needle/3.3.0 (Node.js v18.17.1; linux x64)
Host: github.com
Connection: close
some output from node:
_header: 'GET https://github.com/status HTTP/1.1\r\n' +
'accept: */*\r\n' +
'user-agent: Needle/3.3.0 (Node.js v18.17.1; linux x64)\r\n' +
'host: github.com\r\n' +
'Connection: close\r\n' +
'\r\n',
method: 'GET',
path: 'https://github.com/status',
host: 'localhost',
protocol: 'http:',
statusCode: 502,
statusMessage: 'Bad Gateway',
I cannot see how the CONNECT request is sent.
curl example
curl https://github.com/status -v
* Uses proxy env variable HTTPS_PROXY == 'http://localhost:8888'
* Trying 127.0.0.1:8888...
* Connected to (nil) (127.0.0.1) port 8888 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to github.com:443
> CONNECT github.com:443 HTTP/1.1
> Host: github.com:443
> User-Agent: curl/7.81.0
> Proxy-Connection: Keep-Alive
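The difference between the two traces above can be sketched as follows: the node output shows an absolute-URL GET sent to the proxy (plain forwarding), while curl opens a CONNECT tunnel for the HTTPS target. `proxyRequestFor` is a hypothetical helper written only for illustration; it is not how needle or curl build their requests.

```javascript
// Hypothetical helper: describes the request a client would send to an
// HTTP proxy for a given target URL.
function proxyRequestFor(targetUrl) {
  const target = new URL(targetUrl);
  if (target.protocol === 'https:') {
    // Tunnelling: ask the proxy to open a raw TCP connection to host:port.
    // TLS is then negotiated end-to-end through that tunnel.
    const authority = `${target.hostname}:${target.port || '443'}`;
    return { method: 'CONNECT', path: authority, headers: { Host: authority } };
  }
  // Plain forwarding: the absolute URL goes into the request line.
  return { method: 'GET', path: target.href, headers: { Host: target.host } };
}

console.log(proxyRequestFor('https://github.com/status'));
// { method: 'CONNECT', path: 'github.com:443', headers: { Host: 'github.com:443' } }
console.log(proxyRequestFor('http://example.com/status'));
// { method: 'GET', path: 'http://example.com/status', headers: { Host: 'example.com' } }
```

In Node.js, options like these could be combined with the proxy's host and port and passed to `http.request`; for a CONNECT request, the resulting socket is delivered through the request's `'connect'` event.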
I have found a workaround for me:
var { ProxyAgent } = require('proxy-agent');
var needle = require('needle');
needle.get('https://github.com/status', { agent: new ProxyAgent(), use_proxy_from_env_var: false }, function (error, response) {
  if (!error && response.statusCode == 200)
    console.log(response.body);
  else console.log(response);
});
This is similar to the setup in which we're using needle. We're using proxy-agent or global-agent, but the earlier changes that made needle pick up the env variables broke this.
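For setups that always supply their own agent, the workaround can be wrapped in a small helper. This is only a sketch: `withCustomAgent` is a hypothetical name, and the only needle-specific parts are the `agent` and `use_proxy_from_env_var` options already shown in the snippet above.

```javascript
// Hypothetical helper, not part of needle: merges a custom agent into a
// needle options object and switches off the environment-variable proxy
// lookup, so the agent alone decides how the proxy is reached.
function withCustomAgent(agent, options = {}) {
  return { ...options, agent, use_proxy_from_env_var: false };
}

// Possible usage, assuming needle and proxy-agent are installed:
//   needle.get(url, withCustomAgent(new ProxyAgent()), callback);
```

Because the helper spreads `options` first, a caller cannot accidentally re-enable the environment lookup by passing `use_proxy_from_env_var: true`.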
@tomas with use_proxy_from_env_var: false implemented, in my opinion, we can close this issue.
@dklimpel happy to leave this open if your case isn't fully covered yet.
IMHO this is still open. Needle supports:
- HTTP Proxy forwarding, optionally with authentication
And that is not the case. There is no support for https_proxy at the moment.