Improve error message for unresponsive DNS Resolver #8262
Comments
I guess that if the message could be expanded with the host information available so far it would be an improvement. But since there could probably be a number of reasons for an unexpected name resolution I would not put probable causes in the error message. Only data. |
@bcardiff Data such as connection to 127.0.0.11 timed out would be helpful to include. |
I agree with that. But not with something like "check your system configuration", since it might not be a misconfiguration. |
Fair, and that error might be a bit too vague anyway. The more data on what went wrong, the better. |
So, I looked into the code, and that particular exception message is just passed up from the POSIX getaddrinfo function (a bound C function), not produced by Crystal code. Unfortunately, it appears that's about as good as you can get, since the actual return from getaddrinfo for your example would likely have been EAI_AGAIN per the docs (https://man7.org/linux/man-pages/man3/getaddrinfo.3.html). This was just a long way of saying I don't think what you're asking for is going to be possible without using a different method for name resolution, which would just make things messier than shunting down to getaddrinfo. |
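To illustrate the point above, here is a minimal sketch in Python, whose `socket.getaddrinfo` wraps the same libc function that Crystal binds. It shows that a failed lookup hands back only a numeric `EAI_*` code and a generic message, with no information about which resolver was involved. The helper name `describe_lookup` and the shape of its return value are assumptions made for this illustration; they are not part of Crystal or of this issue.

```python
import socket

# Map numeric EAI_* codes back to their symbolic names for display.
EAI_NAMES = {
    getattr(socket, name): name
    for name in dir(socket)
    if name.startswith("EAI_")
}


def describe_lookup(host, port, flags=0):
    """Resolve host:port and, on failure, report only what getaddrinfo exposes."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
    except socket.gaierror as err:
        # All the caller gets is an error code and its generic description.
        code = err.errno
        return {
            "ok": False,
            "code": EAI_NAMES.get(code, str(code)),
            "message": err.strerror,
        }
    # Each result is (family, type, proto, canonname, sockaddr); keep the addresses.
    return {"ok": True, "addresses": sorted({info[4][0] for info in infos})}
```

Passing `flags=socket.AI_NUMERICHOST` makes the call fail for any non-numeric host without touching the network, which is a convenient way to see the error shape on a machine with no DNS at all.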
OK with this diff https://gist.github.com/rdp/26f5cde11e8e886b71df5bd42a8730d0 running this:
require "socket"
client = TCPSocket.new("localhosttt", 1234)
results in this message:
Unhandled exception: No address found for localhosttt:1234 over TCP when attempting a DNS lookup (Socket::Addrinfo::Error)
Is that better? Accurate? :) |
That’s much more clear. Can you include the IP Address of the responding DNS Server?
What happens if the DNS Server times out or can’t connect?
|
It is attempting to lookup the IP Address when it fails :) It'll have a different message if it can't resolve it, not sure what (basically the first part of that string will change...) :) |
I mean of the DNS Resolver itself: For example if Google DNS 8.8.8.8 returns NXDomain then include in the exception data that 8.8.8.8 says NXDomain.
|
Unfortunately it's just making a call out to "C code land" so it doesn't know exactly which DNS it may end up using, it just uses whatever the kernel does underneath... :) |
Does the kernel not give detailed responses? How does the C program dig know which server responded from the kernel?
% dig crystal-lang.org
; <<>> DiG 9.10.6 <<>> crystal-lang.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63949
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;crystal-lang.org. IN A
;; ANSWER SECTION:
crystal-lang.org. 59 IN A 13.32.238.247
crystal-lang.org. 59 IN A 13.32.238.64
crystal-lang.org. 59 IN A 13.32.238.160
crystal-lang.org. 59 IN A 13.32.238.38
;; Query time: 31 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Nov 16 13:11:00 EST 2019
;; MSG SIZE rcvd: 109
|
dig is probably not using getaddrinfo (which Crystal uses); it's probably hitting the DNS servers directly, itself, instead of having the kernel look it up for it...
|
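As a rough illustration of what "hitting the DNS servers directly" means, the sketch below builds a DNS query by hand and sends it to one chosen server, so the caller always knows which server answered and with what status (for example NXDOMAIN from 8.8.8.8). This is not how dig is implemented; it is a simplified Python sketch of the RFC 1035 wire format, and the function names `build_query`, `response_status` and `ask` are hypothetical. It does not validate the response ID, retry, or fall back to TCP.

```python
import socket
import struct

# Common response codes carried in the low 4 bits of the DNS header flags.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends it.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def response_status(packet):
    """Return the RCODE name found in a DNS message header."""
    _query_id, flags = struct.unpack("!HH", packet[:4])
    rcode = flags & 0x000F
    return RCODES.get(rcode, "RCODE%d" % rcode)


def ask(server, name, timeout=2.0):
    """Query one specific server over UDP so the caller knows who answered."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            packet, _addr = sock.recvfrom(512)
        except socket.timeout:
            return "%s: timed out" % server
    return "%s says %s" % (server, response_status(packet))
```

Calling `ask("8.8.8.8", "crystal-lang.org")` would need network access. The point is only that a program doing its own queries has the server address in hand, whereas a caller of getaddrinfo does not.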
How does it know my system's DNS resolver is 8.8.8.8? |
I don't know; it probably looks it up using some command-line program or config files, etc.? https://unix.stackexchange.com/questions/28941/what-dns-servers-am-i-using/28958 You could check its source and tell us! :) Or is there a better error message possible, maybe? |
What about cases where you have more than one DNS Resolver configured? What if 8.8.8.8 doesn't respond, so 8.8.4.4 is required? Config files can't tell you that. |
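For reference, the "config files" route could look roughly like the Python sketch below, which lists the `nameserver` entries from a resolv.conf-style file. It assumes the system keeps its resolver list in `/etc/resolv.conf`, which is common on Linux but not universal (a local stub such as Docker's 127.0.0.11 may be the only entry). The function names are hypothetical. As noted above, this only shows what is configured, not which server actually answered or failed for a given lookup.

```python
def parse_nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop anything after a comment marker ('#' or ';').
        for marker in ("#", ";"):
            line = line.split(marker, 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def configured_nameservers(path="/etc/resolv.conf"):
    """Read the resolver list from disk; return [] if the file is unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_nameservers(handle.read())
    except OSError:
        return []
```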
I presume that 'command line programs' can (nmcli), but I'm not sure how dig discovers its list, per se. Anyway, I don't think there's an easy C "method call" to look up the DNS servers, and it's so far distant from the actual C call getaddrinfo (unfortunately) that I still think it's going to be hard to actually add DNS info into the error message. At least for me... :) Let me know if the previously proposed is good enough and I could do a PR for it :) |
I think it solves the problem but a larger discussion around a reimplementation is necessary. Perhaps we could borrow some functions from dig in the future?
|
…s may be connected to DNS crystal-lang#8262
If getaddrinfo eventually gets drop-in replacements that look up the DNS servers themselves, then yeah, we'd have access to more info to return. refs: #8376 #4236 #2660 https://stackoverflow.com/a/2157622/32453 My latest incantation is this message:
getaddrinfo: No address found for badhostname:80 over IP when attempting host address resolution. Hint: check hostname, check resolution system (including DNS). (Socket::Addrinfo::Error)
("including" since it can also include .local domains, /etc/hosts, etc...) Any objections? If not then I'll do a PR. Thanks! |
That looks great. Nice work 😎
|
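The general idea of the proposed message, attaching the lookup arguments to whatever getaddrinfo reports, can be sketched as follows. This is a Python illustration of the pattern only, not the Crystal change that was merged. `ResolutionError` and `resolve_for_tcp` are hypothetical names, and the message format here is an assumption rather than the wording quoted above.

```python
import socket


class ResolutionError(Exception):
    """Hypothetical wrapper that keeps the lookup arguments with the failure."""


def resolve_for_tcp(host, port, flags=0):
    """Resolve host:port for TCP, adding host and port to any failure message."""
    try:
        return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
    except socket.gaierror as err:
        # Report data only: what was asked for and what the C call returned.
        raise ResolutionError(
            "getaddrinfo: %s (host=%s, port=%s, code=%s)"
            % (err.strerror, host, port, err.errno)
        ) from err
```

Keeping the original exception as the cause (`from err`) means the raw error code stays available to callers that want to branch on it.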
@rdp is correct -
Probably by reading A trick you might want to know when debugging DNS is to test if Yeah, this is probably not what you expect from DNS resolution on UNIX, but it is what it is unfortunately. |
OK, this is live in 0.33.0 (I tried to add as much as they'd let me get away with) :) |
Why is this throwing a name resolution failure? |
@thelinuxlich Probably because |
When attempting to send a request with http/client today inside of Docker, the OpenVPN Client sets a route of 0.0.0.0/0, which breaks connections to Docker DNS 127.0.0.11. When making a request to a host, I receive the error message: (ERROR) Socket::Addrinfo::Error getaddrinfo: Temporary failure in name resolution /usr/share/crystal/src/socket/tcp_socket.cr:75:15 in 'initialize'. After a while I realized that this error only happened in containers where OpenVPN was running.
For those curious, an OpenVPN Client inside the Docker container allows our developers to copy production data onto the local host for testing how our code works against our production databases. For security reasons our MySQL and PostgreSQL instances are not exposed to the public internet and require users to be connected to our internal VPN. This was tricky to debug until we realized production wasn't affected for some reason and discovered that OpenVPN was the culprit.
It would be helpful to my efforts if the error message included something like "The DNS Resolver x.x.x.x is not responding, check your system configuration."; it would have saved many hours. I consider myself a noob when it comes to networking and can imagine this might confuse other users too. Thoughts?
Error trace