Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dns: add new DNS resolver and set it as default #1843

Closed
wants to merge 39 commits into from

Conversation

mscdex
Copy link
Contributor

@mscdex mscdex commented May 30, 2015

This adds a new pure JavaScript resolver as the default DNS resolver and makes the old (c-ares) resolver behind a --use-old-dns flag.

@brendanashworth brendanashworth added the dns Issues and PRs related to the dns subsystem. label May 30, 2015
@mscdex
Copy link
Contributor Author

mscdex commented May 30, 2015

Some things to note:

  • There is no EDNS support (node/io.js isn't using this at the moment, but c-ares supports it with an explicit flag) or DNSSEC support currently
  • There is still room for various improvements, including:
    • Getting better use out of (bound) UDP sockets and TCP sockets. Whether that means being able to send multiple queries at once or simply keeping the socket open to send another separate query (some kind of queueing system).
    • Caching identical queries globally so that we don't create a bunch of duplicate Buffers.
    • Add file/registry watchers to keep in-memory system DNS settings up-to-date. One issue with this is that we'd need to define how these watchers and overrides set by the user (e.g. setServers() and the like) interact with each other.
    • Currently in the new dns.lookup(), you can pass underlying resolver-specific flags that get ORed to the current system-wide DNS options. This might be fine for some cases, but it may also be useful to completely override system-wide DNS options (e.g. /etc/resolv.conf or process.env.RES_OPTIONS has use_vc set which forces TCP, but at runtime the user wants to try UDP first). I'm not sure what the API would look like for something like that though.

Now for some questions:

  • What (DNS) features do we want to support initially?
  • Currently both the old and new dns modules' dns.lookup() executes the callback immediately if the user is trying to do a normal resolve on an IP address (it passes back the input value as-is). If the V4MAPPED hint is in use though, should we convert IPv4 address inputs to IPv4-mapped IPv6 address first (e.g. '127.0.0.1' would be converted to '::ffff:127.0.0.1')?
  • Do we want to start returning actual DNS errors (e.g. NXDOMAIN, etc.) instead of ENOTFOUND and similar c-ares-specific errors?

/cc @nodejs/collaborators

@ChALkeR
Copy link
Member

ChALkeR commented May 30, 2015

Does this fix #894?

if (!running) {
next();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this to test/common.js?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, I just copied it from one of the other existing test files.

@mscdex
Copy link
Contributor Author

mscdex commented May 30, 2015

@ChALkeR It definitely doesn't crash when calling dns.setServers(). Currently what resolve() does is get the current nameserver array at the very beginning and works off of that and dns.setServers() replaces the entire array at once (not mutating the existing array). This means that only new calls to resolve() will use the new nameserver list.

}

if (typeof resolver === 'function')
resolver(hostname, callback);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old implementation did return a resolver instance (https://github.com/nodejs/io.js/blob/8059393934c2ed0e3e7a179f619b803291804344/lib/dns.js#L271) here. Not sure if it's worth to do so, as apparently this return value was never documented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, a request object. There is no such thing to return anymore and I hate to just return some fake object just for the sake of returning something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree, but it's a tiny change of undocumented functionality. I'd suggest putting a semver-minor because of that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it may already be a semver-major because it's replacing a lot of machinery. At least I thought that was the case for the replacement of the url module and the http parser.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it may as well be a major, because there ought to be more differences.

@silverwind
Copy link
Contributor

Noticed another difference: Responses without an answer section triggered an ENODATA in the old implemenation, but no error and an empty array is given in the new one. Is that intentional?

edit: here's such an response:

Domain Name System (response)
    [Request In: 32]
    [Time: 0.001254000 seconds]
    Transaction ID: 0x0000
    Flags: 0x8180 Standard query response, No error
        1... .... .... .... = Response: Message is a response
        .000 0... .... .... = Opcode: Standard query (0)
        .... .0.. .... .... = Authoritative: Server is not an authority for domain
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... 1... .... = Recursion available: Server can do recursive queries
        .... .... .0.. .... = Z: reserved (0)
        .... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
        .... .... ...0 .... = Non-authenticated data: Unacceptable
        .... .... .... 0000 = Reply code: No error (0)
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 0
    Queries
        nothing: type A, class IN
            Name: nothing
            [Name Length: 7]
            [Label Count: 1]
            Type: A (Host Address) (1)
            Class: IN (0x0001)

...
0020  00 02 00 35 ce 87 00 21 3b 8c 00 00 81 80 00 01   ...5...!;.......
0030  00 00 00 00 00 00 07 6e 6f 74 68 69 6e 67 00 00   .......nothing..
0040  01 00 01                                          ...

@mscdex
Copy link
Contributor Author

mscdex commented May 30, 2015

@silverwind I was doing that initially, but IIRC some tests failed when I did that?

@silverwind
Copy link
Contributor

@mscdex maybe those tests are flawed? Remember which ones? I can definitely see the difference, but it might be hard to reproduce on your side, as I think it's an oddity of my dnsmasq configuration that I don't get an NXDOMAIN.

@mscdex
Copy link
Contributor Author

mscdex commented May 30, 2015

@silverwind Do you know which test that is in particular?

@Fishrock123 Fishrock123 added the semver-major PRs that contain breaking changes and should be released in the next major version. label May 30, 2015
@silverwind
Copy link
Contributor

What (DNS) features do we want to support initially?

Here's what I like to see:

  • TTL support. Could be done either as its own method, or as a second array proviced on the resolve callback.
  • Support for arbitrary RRTYPEs. Again, probably best for a seperate method. RRTYPE being defined by name or decimal value. The response data could just be a Buffer containing the answer bytes.

@mscdex
Copy link
Contributor Author

mscdex commented May 31, 2015

@silverwind TTL support is not a problem, we'd just need to figure out the best way to return that information.

I'm curious though about the arbitrary RRTYPE use cases. What did you have in mind?

@silverwind
Copy link
Contributor

I'm curious though about the arbitrary RRTYPE use cases. What did you have in mind?

Well, there's a whole lot of RRTYPEs one can't query for with the current methods, and as new ones get added all the time, it might be too much work to keep up with these. I once wanted to query the SPF RRTYPE (now deprecated in favor of TXT), and I think it should be at least be possible to access the various DNSSEC records so one could implement DNSSEC in userland.

It's a necessary low-level API imho.

@jasnell
Copy link
Member

jasnell commented May 31, 2015

I'd say arbitrary rtypes is a nice to have but not critical. The
extensibility of DNS has been hindered significantly by lack of
implementations supporting extensibility but it tends to not be a problem
in practice. Perhaps a design that does not preclude additional types being
added easily in the future, or even with parsers provided by third party
modules, would be a good approach.
On May 31, 2015 12:42 PM, "Brian White" [email protected] wrote:

@silverwind https://github.com/silverwind TTL support is not a problem,
we'd just need to figure out the best way to return that information.

I'm curious though about the arbitrary RRTYPE use cases. What did you have
in mind?


Reply to this email directly or view it on GitHub
#1843 (comment).

@mscdex
Copy link
Contributor Author

mscdex commented May 31, 2015

@silverwind The problem though is that to use something like DNSSEC, core would have to be able to know about DNSSEC, because there are special interactions (EDNS0) that can occur between RRs and the normal DNS header (e.g. the RCODE being combined with a value in the RR itself to represent the real, extended RCODE). This problem exists both for sending and receiving abitrary RRs (like DNSSEC).

@silverwind
Copy link
Contributor

Back on the ENODATA topic: I think it's fine like it is now (no error and an empty array) on an empty response. The RCODE is NOERROR after all which makes it more conformant with the DNS error model.

But we should clearly communicate this change in the release notes, as I think it may be pretty common for a user to just check for err and then trying to iterate the addresses array, which then fails silently.

@silverwind
Copy link
Contributor

Regarding TTL, I could see changing the signature of resolve to

dns.resolve(hostname[, rrtype][, options], callback)

Possible options:

  • ttl: Include a array of TTLs as an additional callback parameter, in the same order of the addresses
  • servers: Array of servers to use. A nice async alternative to setServers

Regarding the breaking changes, I'm more leaning towards keeping old error codes and ENODATA. While from a technical standpoint both are not optimal, I fear we'd be breaking a lot of things unnecessarily.

@mscdex
Copy link
Contributor Author

mscdex commented Jun 1, 2015

A couple of other differences between this resolver and c-ares:

  • This resolver fails fast on receipt of malformed DNS packets, whereas c-ares will continue to re-query the same server it got the malformed packet from (until the number of tries left reaches zero or timeout happens, whichever happens first).
  • With this implementation, dns.lookup() and dns.resolve() use the same mechanism for resolving except dns.resolve() only considers nameservers and does not take into account the OS hosts file. This means that dns.setServers() affects both functions for example.

@sam-github
Copy link
Contributor

With this implementation, dns.lookup() and dns.resolve() use the same mechanism for resolving except dns.resolve() only considers nameservers and does not take into account the OS hosts file. This means that dns.setServers() affects both functions for example.

Does this mean that lookup isn't calling the native routines, you just parse some of the static entries out of hosts, and the like?

This is pretty different than now, isn't it? The native resolvers are often pluggable, you'll get results from mDNS, sun white pages, NIS, etc. My understanding was lookup existed specifically to be the same as the native resolver, because it used the native resolver. Also, that with pure c-ares/resolution-by-DNS, that users were reporting bugs against node when it didn't return the same results as all the other applications running on their systems.

@mscdex
Copy link
Contributor Author

mscdex commented Jun 1, 2015

@sam-github That's correct, my goal was to avoid having to go to the thread pool.

@silverwind
Copy link
Contributor

lookup must use the system resolver (getaddrinfo), no way around it. Also, I think it must not interact with setServers because how would you reset it once set?

@saghul
Copy link
Member

saghul commented Jun 3, 2015

Was it ever on the table to remove DNS from core? I know, I know it would break things, but stay with me, maybe there is a way around that.

Here is an idea:

  • dns module removed from core
  • current dns module which uses c-ares as a standalone "c-ares" module on NPM
  • net.getaddrinfo / net.getnameinfo exposed

This way, core still has the DNS capability it needs (getaddrinfo basically, if I'm not mistaken), and users have a bunch of options for their DNS needs on NPM:

  • c-ares (formely part of core)
  • jsdns / dnsjs / somethingdns (this very module)
  • getdns (there is already a module!)
  • unbound (someone could write it)
  • udns (someone could write it)
  • knot (someone could write it)

So, c-ares could be released on NPN, and Node could deprecate it (by printing a deprecation warning), then eventually remove it. The deprecation warning for dns.lookup and dns.lookupService would point users at net.getaddrinfo / net.getnameinfo.

And those are my 2 cents!

@mscdex
Copy link
Contributor Author

mscdex commented Jun 4, 2015

@silverwind How about if we encounter nss modules other than dns and files in nsswitch.conf, we load and execute those nss modules in the thread pool? As far as resetting nameservers, we could provide a separate function like resetServers() or similar.

@mscdex
Copy link
Contributor Author

mscdex commented Jun 4, 2015

FWIW I'm -1 on removing dns client functionality from core as I see it as being equally important as http for example.

This fixes code consistency, formatting, and linting issues.
It also arranges the tests alphabetically.
@silverwind
Copy link
Contributor

By the way, would it be hard to support DNS classes other than IN like this one?

$ dig +short @4.2.2.2 version.bind chaos txt
"Version: main/17936"

@mscdex
Copy link
Contributor Author

mscdex commented Jan 12, 2016

@silverwind Probably not, but I'm not sure how common non-Internet requests are these days?

@silverwind
Copy link
Contributor

Pretty uncommon, but the CH class along with the version.bind hostname seems to be supported (and enabled by default) by all major DNS servers, I think we should support it as it can be quite useful for pentesting purposes. There's also the HS class, no idea how useful that one is.

Anyways, I think we should support a signature of dns.resolve(hostname[, rrtype][, class], callback).

More info on those two classes: https://miek.nl/posts/2009/Jul/31/dns-classes/

@silverwind
Copy link
Contributor

Oh, and feel free to ignore my feature requests, these are probably better done after this has landed with the current functionality in place.

@jbergstroem
Copy link
Member

@silverwind even more reason to have this land soon.

@mscdex could you rebase? I'd like to get a CI run going (tried but got conflicts).

@mscdex
Copy link
Contributor Author

mscdex commented Jan 12, 2016

@jbergstroem It might be awhile, I'm working on several different implementations locally while trying to squeeze out as much performance as possible, so my working directory is a little messy at the moment.

@msimerson
Copy link

If we want to incorporate DNS error changes (see #3891) while we are replacing the DNS client, since both would be semver-major anyway as far as I know?

FWIW, +1

The existing c-ares DNS error messages are...obtuse and I'd welcome having to change 100 instances of my programs that check for specific DNS errors in exchange for having real DNS errors.

@mscdex
Copy link
Contributor Author

mscdex commented Mar 12, 2016

I'm not sure if this will pan out as planned.... I can't seem to get latency close enough to the existing dns mechanism. I'm not sure what exactly is to blame as the --prof logs don't really seem to say much.

For example, performing 1000 simple IPv4 resolve()s (serially) takes the existing dns module ~44ms. The lowest I can currently get it down to with the pure js implementation is ~130ms (~115ms if the same UDP socket is used for all of the requests). Obviously performing some in-memory caching could help, but IIRC I don't think the glibc resolver does anything like that itself (there is a separate caching module that can be used however). Additionally I think the pure js resolver would win when it comes to parallel requests, since it's not using the thread pool (as much), but still ...

Also IIRC I don't think even removing the udp module and using the UDP binding directly helps very much. 😕

@MylesBorins
Copy link
Contributor

@mscdex any plans to try and get this in for v6?

@mscdex
Copy link
Contributor Author

mscdex commented Apr 21, 2016

@thealphanerd No. As previously noted I'm not sure it will be possible to get the latency close enough to the existing resolver(s). Having a fair amount of additional latency is probably not something most people are willing to sacrifice when it comes to something like dns resolution.

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Apr 21, 2016
@estliberitas estliberitas force-pushed the master branch 2 times, most recently from 7da4fd4 to c7066fb Compare April 26, 2016 05:22
@benjamingr
Copy link
Member

@mscdex any chance to ship it behind a flag in some v6 minor so people can play with it? Are you still interested in this PR?

@mscdex
Copy link
Contributor Author

mscdex commented Apr 30, 2016

@benjamingr IMHO if we can't get performance to at least be approximately the same as using c-ares, then it's probably not worth making available. To me, dns is one of those things where speed is most important.

I'm interested in it as much as I am interested in the pure js http parser, but if performance isn't there right now, it's not worth integrating yet.

@benjamingr
Copy link
Member

@mscdex are you still working on this?

@jasnell
Copy link
Member

jasnell commented Apr 30, 2016

Tend to agree here. Even behind a flag if the performance isn't there it's not worth getting in.

@mscdex ... this is something that I'd ultimately like to see happen tho... is it something you're still actively pursuing at all or just have it on the back burner?

@mscdex
Copy link
Contributor Author

mscdex commented Apr 30, 2016

@jasnell It's more or less on the back burner for now. I'm still not sure if it will be possible to match the speed from js land, since even using bare UDPWrap does not help (enough). It could entirely be I'm missing something obvious, but I don't know.

@jasnell
Copy link
Member

jasnell commented Aug 6, 2016

@mscdex ... given the stalled status, the merge conflicts, and the uncertainty about this moving forward, I'm going to go ahead and close. Obviously we can reopen if we decide to move forward with this direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dns Issues and PRs related to the dns subsystem. semver-major PRs that contain breaking changes and should be released in the next major version. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.