Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A way to get actual client IP #4924

Closed
arabcoders opened this issue Jul 31, 2022 · 17 comments
Closed

A way to get actual client IP #4924

arabcoders opened this issue Jul 31, 2022 · 17 comments
Labels
feature ⚙️ New feature or request help wanted 🆘 Extra attention is needed

Comments

@arabcoders
Copy link

arabcoders commented Jul 31, 2022

Hello,

It seems using request_header directive to alter request headers does not make it to the log directive, i mainly use the transformer plugin to alter the logged IP, some requests are via CDN others directly, and it seems caddy does not always set X-Forwarded-For which i think is probably the intended behavior. Anyway here is an example of the behavior i am describing

{
	admin 127.0.0.1:20199
	debug
	log stdout
}

:7885 {
	log

	@no_forawrded_for not header X-Forwarded-For *
	request_header @no_forawrded_for X-Forwarded-For {remote_host}

	respond {header.X-Forwarded-For} 200
}

as you can see from the output, it's clearly being set.

$ http http://localhost:7885

HTTP/1.1 200 OK
Content-Length: 9
Date: Sun, 31 Jul 2022 19:14:12 GMT
Server: Caddy

127.0.0.1

However, the logger does not show the header.

2022/07/31 19:14:05.744 INFO    serving initial configuration
2022/07/31 19:14:05.744 INFO    tls     finished cleaning storage units
2022/07/31 19:14:05.744 INFO    watcher watching config file for changes        {"config_file": "./Caddyfile"}
2022/07/31 19:14:12.440 INFO    http.log.access handled request {"request": {"remote_ip": "127.0.0.1", "remote_port": "56400", "proto": "HTTP/1.1", "method": "GET", "host": "localhost:7885", "uri": "/", "headers": {"User-Agent": ["HTTPie/1.0.3"], "Accept-Encoding": ["gzip, deflate"], "Accept": ["*/*"], "Connection": ["keep-alive"]}}, "user_id": "", "duration": 0.0001012, "size": 9, "status": 200, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}

i expect to see the added header to show up, however it seems the log directive receive the unmodified request object. If that's the intended behavior, then i apologize.

I have log analyzers and they are failing due to missing IP in logs as some requests originating locally or someone hitting the IP directly instead of the frontend proxies.

@francislavoie
Copy link
Member

It seems using request_header directive to alter request headers does not make it to the log directive

That's by design. The logs show the original headers as Caddy received them. The log entry is provisioned before any handling happens.

To clarify, the log directive is not an HTTP handler, it's just a configuration point to enable and configure access logging. Run caddy adapt --pretty --config /path/to/Caddyfile to see the underlying JSON config, where that will be made clearer.

@mholt mholt closed this as not planned Won't fix, can't repro, duplicate, stale Jul 31, 2022
@mholt
Copy link
Member

mholt commented Jul 31, 2022

No apology needed. 😃

@arabcoders
Copy link
Author

arabcoders commented Jul 31, 2022

@francislavoie @mholt Thank you for your prompt responses.

Do you think it's possible to achieve what i want using any currently available method?

right now i can only think of the following possible means all probably involve caddy modifications either via plugin or official support.

  • Caddy provide something like (request>remote_ip} but it takes into account valid X-Forwarded-For header. i.e {request>client_ip}. Right now caddy does it for the remote_ip directive using the @matcher remote_ip forwarded logic.
  • A modification to the transformer plugin to support alternative default i.e {request>headers>X-Forwarded-For>[0]:-request>remote_ip}
  • a way to add context to the log. the only possible way currently is by adding response header and parsing that in log, however this leaks user ip which i don't think is good idea.
  • support request matchers in log context i.e log @hasforwarded.

@mholt
Copy link
Member

mholt commented Aug 1, 2022

If I understand correctly, you want to emit a value to your log entries that basically is the result of this Javascript-like-pseudocode: {header.X-Forwarded-For} || {request>remote_ip} - is that right?

@arabcoders
Copy link
Author

If I understand correctly, you want to emit a value to your log entries that basically is the result of this Javascript-like-pseudocode: {header.X-Forwarded-For} || {request>remote_ip} - is that right?

Indeed thats what i want.

@mholt mholt added the feature ⚙️ New feature or request label Aug 1, 2022
@mholt mholt reopened this Aug 1, 2022
@mholt mholt changed the title request_header changes not propagating to log directive A way to get actual client IP Aug 1, 2022
@mholt mholt added the help wanted 🆘 Extra attention is needed label Aug 1, 2022
@arabcoders
Copy link
Author

Thanks for taking time to consider this. Is there any specific approach you would like to see this implemented in?

i have little golang experience i may be able to contribute something if it's not too deep.

@mholt
Copy link
Member

mholt commented Aug 2, 2022

Thanks for offering!

Well, the implementation itself should be simple enough. But I think we may need to answer some questions first, like how to know whether to trust the X-Forwarded-For header. The reverse proxy has configuration to allow the header only from certain clients. We would need something similar for this mechanism to be trustworthy. That will require some thought/discussion.

@arabcoders
Copy link
Author

arabcoders commented Aug 2, 2022

Thinking out loud here, if i was to propose a caddy wide solution i think something like this might be appropriate.

  • Deprecate/remove @matcher remote_ip forwarded it feels out if place to be honest.
  • remote_ip should serve as it is true pointer to remote_addr.
  • Introduce trusted_proxy as global directive. instead of just reverse_proxy sub-directive. Which can be overridden using reverse_proxy sub-directive. This might be too much. If so using the current implementation should suffice.
  • Add new directive client_ip which act exactly like remote_ip. if no trusted_proxy is found it should report remote_addr. Otherwise use the current logic to determine @matcher remote_ip forwarded value and use that.

Of course this probably huge undertaking, but i think it's the correct way.

On personal note, having alternative value for the transformer plugin is enough for majority of the cases. For example, {request>headers>x-forwarded-for>[0]:-request>remote_ip} I think the default/alternative value already supported for environment variables.

@mholt
Copy link
Member

mholt commented Aug 4, 2022

That actually sounds like a pretty good solution. Definitely some work there, but would be happy to review a PR for it. (Am a little busy currently!)

@arabcoders
Copy link
Author

arabcoders commented Aug 5, 2022

That actually sounds like a pretty good solution. Definitely some work there, but would be happy to review a PR for it. (Am a little busy currently!)

Sorry for the late reply, dealing with COVID infection is not fun, I'll see if i can make PR for it

@francislavoie
Copy link
Member

I have other priorities to focus on first, but I can work on this later. My rough plan:

  • We can add trusted_proxies to http.servers, and on Provision just like reverse_proxy, it would parse the CIDRs to a more efficient slice.
  • Add trusted_proxies to the Caddyfile servers global options (which puts it in the JSON config)
  • Early in http.Server.ServeHTTP, calculate the client IP, store it in the request context and the replacer
  • The server's trusted proxies can get pulled from ServerCtxKey by any handler (including reverse_proxy, by default).
  • Maybe implement a client_ip request matcher (similar to remote_ip but on the computed client IP)
  • Consider adding client_ip to the request logs 🤔 might be duplicate of remote_ip a lot of the time, but it would be handy for situations where it isn't a duplicate, I guess.
  • Consider adding an option to http.servers to overwrite req.RemoteAddr with the computed client IP if trusted; the logs would still have remote_ip be the original connecting IP, but this would make it transparent to any other handler what the original client IP was... this has some problematic implications that need to be carefully considered though, like how reverse_proxy will handle that, if that option is on then it shouldn't get confused between which is the right one to use, etc.

@arabcoders
Copy link
Author

@francislavoie Thank you, Sadly my golang and caddy internal experience is too weak to attempt this. So, instead i switched my focus to attempt to add support for defualt/alternative value for the transformer plugin.

@francislavoie
Copy link
Member

FYI @arabcoders I implemented the first half of what I was talking about in #5103

@arabcoders
Copy link
Author

FYI @arabcoders I implemented the first half of what I was talking about in #5103

Thank you this is really great

@francislavoie
Copy link
Member

I rebased and updated #5103 and #5104 which cover this feature.

@robgordon89
Copy link

@francislavoie thanks for your work on this, this is going to be epic 🥳

@francislavoie
Copy link
Member

This can be closed, we merged the aforementioned pull requests which implement first-party client IP support!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request help wanted 🆘 Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants