Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for DNS hijacking #1209

Closed
wants to merge 3 commits into from
Closed

Conversation

vtemian
Copy link

@vtemian vtemian commented Jun 6, 2017

This is more of a naive approach and maybe the API is too intrusive, but we can change that.

First I started by comparing the hosts, without support for ports (as curl have in its --resolve argument).
In that implementation it made sense to use hijack_dns_resolver, since it map a host to a custom ip.

Since I've added support for ports, maybe hijack_dns_resolver is not the good, but I liked the name so I kept it. I'm open to change it (maybe hijack_resolver or just resolver is better).

vtemian added 3 commits June 6, 2017 21:38
Support custom DNS resolution for a set of hosts.
`hijacked_dns_resolver` is a dict that contains pairs of host and an ip
used to make the connection.
@sigmavirus24
Copy link
Contributor

Hi @vtemian

If this is related to an issue, it'd be helpful if you updated your pull request description with a reference to the issue, e.g.,

Implements #1234

I haven't looked at this yet, but it'd be good to link the discussions if they exist.

Cheers,
Ian

@vtemian
Copy link
Author

vtemian commented Jun 6, 2017

Hi @sigmavirus24,

It doesn't have an issue, just a closed PR (#1188).
Do you think that is better to open an issue or is enough to continue the discussion in this PR?

@codecov-io
Copy link

codecov-io commented Jun 6, 2017

Codecov Report

Merging #1209 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1209   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files          21      21           
  Lines        1975    1982    +7     
======================================
+ Hits         1975    1982    +7
Impacted Files Coverage Δ
urllib3/poolmanager.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3695c67...55c4dec. Read the comment docs.

Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for providing a patch to mull over! It's much easier to discuss concrete ways forward in the presence of actual sample code.

So this will work, but it causes me to want to ask a few questions. Firstly, is this the right layer of abstraction? This means essentially that ConnectionPools always know only where they're actually going, which means they can't really lie about HTTP headers. I wonder if this needs to be plumbed all the way down into HTTPConnection so that it can override the socket connection but otherwise do the right thing.

The second question it raises is about API. Is a simple mapping the right approach here? It seems like we have to ask a complicated-enough question that we may want to accept a mapping as the argument but wrap it in a helper object that encapsulates some of the logic.

What do you think?

RequestMethods.__init__(self, headers)

self.hijacked_dns_resolver = hijacked_dns_resolver
if not isinstance(self.hijacked_dns_resolver, dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not do this. Anything that implements Mapping is fine, and in fact we shouldn't test for type at all. Let's just do if hijacked_dns_resolver is None: instead.

@haikuginger
Copy link
Contributor

haikuginger commented Jun 7, 2017

One option could be to pull DNS resolution into a pluggable class; have a single class that, by default, does what we do now. It'd be a bit more complicated than passing in a dict, but it could be more adaptable to other purposes long-term.

@Lukasa
Copy link
Contributor

Lukasa commented Jun 7, 2017

The issue with doing it at a module scope is it becomes shared mutable global state, which means we need to lock around it. Python locking is sloooooooooooow, and doing that at the DNS resolution step would be particularly painful as it's so frequent. Passing an object, while annoying in terms of plumbing it through, has the advantage of ensuring that the state is minimally shared.

Though of course I just realised it's not completely un-shared. Hrm. Crap.

Well. Isn't this a turn-up for the books.

I suppose the GIL protects us a bit here so long as we use a dictionary. Maybe anything that implements __getitem__ is GIL-safe? Probably not though. So if we're worried about the thread-safety of this then we do end up in a bit of trouble here.

@haikuginger
Copy link
Contributor

@Lukasa, whoops, used the wrong wording. Did not mean "module" in the specific Python-y meaning of the word.

@Lukasa
Copy link
Contributor

Lukasa commented Jun 7, 2017

Hah, your ninja edit now makes me look like someone who wildly misunderstood you. 😉 I'm happy to be the cranky old man though.

Yeah, I'm open to having a, say, DNSResolver class that defines some strategies for how to resolve and that manages this. This DNSResolver class could be auto-created as needed, but allow for specific overrides if people need it. How does that sound?

@haikuginger
Copy link
Contributor

haikuginger commented Jun 7, 2017

Yeah, that's what I was thinking. Something like this:

class ResolveMyFakeDomain(DnsResolver):

    domain_mapping = {
        'mydomain.com': '127.0.0.1'
    }

    def resolve(self, request):
        # See if the request domain is one we want to hijack
        domain = self.domain_mapping.get(request.domain)
        if domain is None:
            # Call out to the superclass that has actual DNS logic
            domain = super(ResolveMyFakeDomain, self).resolve(request)
        return domain

class FakeResolvingPoolManager(PoolManager):
    
    dns_resolver = ResolveMyFakeDomain

__init__ for PoolManager (or wherever) would also obviously need to be able to take an initialized DnsResolver object.

@sigmavirus24
Copy link
Contributor

I'd rather have a DNSMapping class such that you could do:

dnsmapping = DNSMapping()
dnsmapping.add(ip='127.0.0.1', hostname='google.com', port=80)

And have whatever Resolver use that. This means we can hide the underlying mapping from the user and since this is an advanced enough feature, it's not unreasonable to ask those users to build up their mapping this way.

@vtemian
Copy link
Author

vtemian commented Jun 7, 2017

I don't think that an explicit mapping is required. Extending a DNSResolver looks pretty neat. One can use an external data source to solve the mapping (like redis, rest api etc) via custom business logic that can be hidden in that resolve method

@sigmavirus24
Copy link
Contributor

@vtemian Having an explicit mapping object will allow us to expand things without having to parse strings. The thing I dislike most about this PR is the idea of

{
    'example.com:443': '127.0.0.1',
}

If we want to only route requests to example.com on port 443, then let's make that explicit and hard to get wrong. If we have other rules we later want to add, extending the mapping object is better than trying to parse something that has no standard.

@theacodes
Copy link
Member

Thank you for creating this pull request, unfortunately, for one reason or
another it has become stale. We are declaring PR bankruptcy (see #1370) to
better focus our limited volunteer resources. If you still think this PR is
relevant and useful, please comment and let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants