socketry / async-http

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

SocketError: getaddrinfo with iOS Personal Hotspot

trevorturk opened this issue · comments

Apologies in advance for this vague issue report, but I thought I should post something just in case future Googlers run into this issue, or maybe someone has an idea and/or some ideas about how to debug.

When I'm using macOS 12.6 while connected my iOS 16.0.2 iPhone's "Personal Hotspot" with Net::HTTP, I'm able to make HTTP requests like so:

uri = URI('http://example.com/')
=> #<URI::HTTP http://example.com/>
Net::HTTP.get_response(uri)
=> #<Net::HTTPOK 200 OK readbody=true>

However, when using Async::HTTP, I get an error:

internet = Async::HTTP::Internet.new
=> #<Async::HTTP::Internet:0x000000010c44eeb0 @clients={}, @options={}>
Async { internet.get("https://example.com") }
13.56s     warn: Async::Task [oid=0x16684] [ec=0x16698] [pid=78449] [2022-10-03 15:10:21 -0500]
               | Task may have ended with unhandled exception.
               |   SocketError: getaddrinfo: nodename nor servname provided, or not known
               |   → /Users/trevorturk/.rbenv/versions/3.1.2/lib/ruby/3.1.0/resolv.rb:770 in `ip'

I noticed my macOS DNS settings looked like so:

Screen Shot 2022-10-03 at 3 24 02 PM

...so I tried changing to use Cloudflare's 1.1.1.1 and then things worked as expected.

I'm happy enough with this setup since it appears to be working, but I wonder if other people might run into the issue going forward. If I can help testing/debugging somehow, please do let me know! Otherwise I may leave this issue open for a while and then close for future possible Googlers!

I think this is the same as socketry/async#180 - can you confirm it's the IPv6 address causing problems? i.e. does it work with just 172.20.10.1 (similar to how you tested 1.1.1.1).

Thanks so much, sorry I missed that other issue. I tested and confirmed that 172.20.10.1 works. I noticed some debugging ideas in that other issue. Please let me know if there's anything you'd like me to test, or feel free to close this as a dupe. I'm subscribed to the other issue, so I'll see activity there as well. Thank you!

Can you try the IPv6 address just by itself and reproduce the failure? i.e. it's not the combination of two addresses that causes the issue but the presence of an IPv6 address that causes failure.

I tested with the following, doing my best to avoid typos, so we'd just be testing the auto-populated address:

Screen Shot 2022-10-03 at 5 54 40 PM

...and the issue occurred: SocketError: getaddrinfo: nodename nor servname provided, or not known was raised. I think that's what you meant to test, but let me know!

Are you able to try on the latest release of macOS (Ventura, 13.0)? I can't reproduce it locally.

I am still able to reproduce:

Screenshot 2022-10-31 at 10 30 27 AM

I'm also on the latest iOS version (not beta, but the latest release)

I'm not sure how to help reproduce, but please do let me know if you have any ideas, or if you'd like to screen share or anything like that!

Thanks, that's really helpful, at least I know it's something I should be able to reproduce.

Okay, here is my next question, are you able to reproduce the issue using a public IPv6 DNS resolver? e.g. CloudFlare has 2606:4700:4700::1111. If you can reproduce the issue with that public server that is very helpful for me because then I should be able to set up my own environment to match.

I'm sorry to say that it seems I cannot reproduce the issue with 2606:4700:4700::1111 sorry!

Ahhh so it's something specific to some resolvers and not others!

It occurred to me today that I should ask if there's anything I can provide in terms of a description of my local environment, or if perhaps it would be helpful for me to test with a Docker environment I could share. (I'm wondering if there might be something different with openssl etc.) Please do let me know if anything like that would be helpful!

The fact you can't reproduce it with some servers is helpful knowledge. What might be the next step is if I can some how log into your environment and poke around. But, let me try for a repro on my end.

@ioquatix I just ran into this the other day and tried to run Resolv.getaddress on bin/rails c within the dev same environment as @trevorturk.

Loading development environment (Rails 7.0.4.3)
irb(main):001:0> Resolv.getaddress "google.com"
=> "74.125.136.100"

I can also help with repros if needed.

I put a binding.irb in the resolv.rb exception site and got this:

irb(#<Resolv::DNS::Requester::UnconnectedUDP:0x000000010685e970>):004:0> Addrinfo.ip(host).ip_address
/Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:4:in `ip': getaddrinfo: nodename nor servname provided, or not known (SocketError)
	from /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:4:in `sender'
	from <internal:prelude>:5:in `irb'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:770:in `sender'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:527:in `block in fetch_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1126:in `block (3 levels) in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1124:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1124:in `block (2 levels) in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1123:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1123:in `block in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1121:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1121:in `resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:521:in `fetch_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:507:in `each_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:402:in `each_address'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:116:in `block in each_address'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:115:in `each'
	... 30 levels...
irb(#<Resolv::DNS::Requester::UnconnectedUDP:0x000000010685e970>):005:0> host
=> "fe80::887:c7ff:fe62:d64%en0"

I have a feeling something about this %en0 (network interface?!) in the ipv6 DNS hostname is not happy. Removing it works fine.

Addrinfo.ip("fe80::887:c7ff:fe62:d64").ip_address
=> "fe80::887:c7ff:fe62:d64"

Edit: I can also confirm that this %en0 seemingly gets added when I'm tethering to an AT&T device from macOS since it is listed as such in resolv.conf:

cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
nameserver fe80::887:c7ff:fe62:d64%en0
nameserver 172.20.10.1

But this isn't shown in the macOS networking settings if you look at DNS servers:

image

AFAIK en0 in macOS parlance is an identifier for Wi-Fi network interface as this shows:

$ networksetup -listallhardwareports | grep -C 2 en0

Hardware Port: Wi-Fi
Device: en0
Ethernet Address: f0:2f:5b:01:23:b8

It seems like Resolv is choking on this identifier when it likely should be entirely ignored. Might have to file a Ruby bug report for this.

I've created an issue in the resolv repo.

As Trevor did, I've isolated this behavior to within Async exclusively although on a newer Ruby version (while tethering to a mobile device connected to a ISP/provider like AT&T which supplies ipv6 domain name servers with a suffix, like fe80::887:c7ff:fe62:d64%en0), so while it might be related to Resolv behavior I can only reproduce it within an Async context:

irb(main):022:0> ::Resolv.getaddresses("google.com")
=> ["142.251.35.238", "2607:f8b0:4008:815::200e"]
irb(main):023:0> Async { ::Resolv.getaddresses("google.com") }
    2m     warn: Async::Task [oid=0x3c398] [ec=0x3c3ac] [pid=4196] [2023-04-26 13:57:23 -0400]
               | Task may have ended with unhandled exception.
               |   SocketError: getaddrinfo: nodename nor servname provided, or not known
               |   → /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:771 in `ip'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:771 in `sender'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:527 in `block in fetch_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1128 in `block (3 levels) in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1126 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1126 in `block (2 levels) in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1125 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1125 in `block in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1123 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1123 in `resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:521 in `fetch_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:507 in `each_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:402 in `each_address'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:116 in `block in each_address'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:115 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:115 in `each_address'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:102 in `getaddresses'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:51 in `getaddresses'
               |     (irb):23 in `block in <top (required)>'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:158 in `block in run'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:310 in `block in schedule'
=> #<Async::Task:0x000000000003c398>

Now here's the kicker, I can't reproduce this on Ruby 3.0:

irb(main):007:0> RUBY_VERSION
=> "3.0.2"
irb(main):008:0> Async::VERSION
=> "1.31.0"
irb(main):009:0> ::Resolv.getaddresses("google.com")
=> ["142.250.217.174", "2607:f8b0:4008:815::200e"]
irb(main):010:0> Async { ::Resolv.getaddresses("google.com") }
=>
#<Async::Task:0x00000001079a1ea8
 @annotation=nil,
 @children=nil,
 @fiber=nil,
 @finished=nil,
 @head=nil,
 @logger=nil,
 @object_name=nil,
 @parent=nil,
 @reactor=
  #<Async::Reactor:0x00000001079a2510
   @annotation=nil,
   @blocked=0,
   @children=#<Async::Children:0x00000001079a1e58 @head=nil, @size=0, @tail=nil, @transient_count=0>,
   @guard=#<Thread::Mutex:0x00000001079a21c8>,
   @head=nil,
   @interrupted=false,
   @logger=nil,
   @object_name=nil,
   @parent=nil,
   @ready=[],
   @running=[],
   @scheduler=#<Async::Scheduler:0x00000001079a2218 @reactor=#<Async::Reactor:0x00000001079a2510 ...>>,
   @selector=nil,
   @tail=nil,
   @timers=
    #<Timers::Group:0x00000001079a23f8
     @events=
      #<Timers::Events:0x00000001079a23d0
       @queue=[],
       @sequence=#<Timers::PriorityHeap:0x00000001079a23a8 @contents=[]>>,
     @interval=#<Timers::Interval:0x00000001079a2290 @current=1004617.430651, @total=0.0>,
     @paused_timers=#<Set: {}>,
     @timers=#<Set: {}>>,
   @transient=false,
   @unblocked=[]>,
 @result=["142.250.217.174", "2607:f8b0:4008:815::200e"],
 @status=:complete,
 @tail=nil,
 @transient=false>

Ruby 3.1 / Async 2+ uses the new address_resolve fiber scheduler hook. I think that could be the root cause of the issue.

https://github.com/socketry/async/blob/main/lib/async/scheduler.rb#L152-L154

I wonder if our wrappers around getaddrinfo are not working correctly with interface suffix.

cc @bruno-

Scheduler was definitely part of the backtrace when I was started looking into this.

I believe this issue should be fixed by socketry/async#243. It would be great if people who can reproduce it, can test it. I'll cut a new release of Async 2.x soon.

I'm not on AT&T anymore, so I haven't been able to reproduce for a while, but perhaps @olivierlacan can give it a shot?