Default timeouts are quite low. #1464

Closed
johnnylee opened this issue Feb 20, 2017 · 16 comments

@johnnylee

(0.9.5) I've been using Caddy as a reverse proxy to provide tls, logging, and gzip to back-end services. I was seeing seemingly random file upload failures with code 502 and an error message client disconnected in the log. Eventually I figured out that the requests were timing out.

I think the solution is to have either higher timeout values by default or an explicit log message when a timeout occurs.

Thanks,
JohnnyLee
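
For context, the kind of setup described above would look roughly like this in a 0.9.x Caddyfile; the hostname, upstream port, and log path are placeholders, not the reporter's actual config:

example.com {
    tls admin@example.com
    log /var/log/caddy/access.log
    gzip
    proxy / localhost:8080 {
        transparent
    }
}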

@tobya
Collaborator

tobya commented Feb 20, 2017

@johnnylee Was there any timeout message in your -log destination? Can you post here?

@mholt
Member

mholt commented Feb 20, 2017

The default timeouts are set at what they are because, in my local testing with a slowloris tool, they were the values that best mitigated the effects of the attacks. Raising them will make servers more vulnerable to resource depletion (depending on exact system configuration and specs).

I hope that people are reading the release notes before they upgrade.

Timeout events should be logged, as @tobya mentioned. I see log messages mentioning "timeout".

And like I've said before, this seems to be frustrating to a number of people, but I don't know what's worse: being frustrated because it's secure by default or being frustrated because your server is attacked with slowloris. I'm not sure that slowloris attacks are common, though, so I'm not sure what to do. Secure by default is kind of the Caddy philosophy...

@mholt mholt added the discussion 💬 The right solution needs to be found label Feb 20, 2017
@johnnylee
Author

@tobya - There wasn't any message in my -log file. The only message was from Caddy itself, which was the client disconnected message:

Feb 20 14:26:39 ip-172-30-1-19 caddy[9448]: 20/Feb/2017:14:26:39 +0000 [ERROR 502 /u/398994285254414337/] client disconnected

I think if it were logged, then it wouldn't have been an issue.

@pwaller

pwaller commented Feb 20, 2017

I don't know what's worse: being frustrated because it's secure by default or being frustrated because your server is attacked with slowloris. I'm not sure that slowloris attacks are common, though, so I'm not sure what to do.

It's a tough one. Particularly if you're on the receiving end of a problem and it's not blindingly obvious what is wrong (see #1422).

It seems a tough tradeoff. As a visitor of the internet, I have a 4G mobile connection and I work for a couple of hours from the train every day. Will this timeout prevent me from downloading large files from caddy servers when my internet connection is running at 10kiB/sec for a few minutes before returning to 100-1000kiB/sec? If so, that's quite annoying.

@pwaller

pwaller commented Feb 20, 2017

@johnnylee "The only message was from caddy itself", you mean, caddy was proxying to {another app}, and {another app} sees caddy disconnect. But caddy itself doesn't make it clear that the client timed out? Yeah, I hit that too over in #1422.

@johnnylee
Author

johnnylee commented Feb 20, 2017

@pwaller I mean the application didn't log anything at all. As far as I can tell, the request never reached the application's handler. The message was in my /var/daemon.log file (Caddy is run via systemd).

@pwaller

pwaller commented Feb 20, 2017

Please ignore me. I missed that this was happening during uploads.

@mholt
Member

mholt commented Feb 20, 2017

@johnnylee So Caddy logged "client disconnected" but the application did not log anything?

The "client disconnected" message is emitted when an HTTP/2 connection closes: https://golang.org/src/net/http/h2_bundle.go?h=http2errClientDisconnected

My tests were with plain HTTP, so HTTP/1.1, which emitted an "i/o timeout": https://golang.org/search?q=timeoutError

@johnnylee
Author

@mholt That's correct. I put a log message at the top of the application's handler to see if it was being called, and it didn't show up in the log. I would guess it has to do with some buffering occurring during the proxying process.

@mholt
Member

mholt commented Feb 20, 2017

@pwaller Your point is still valid: the timeouts apply to both reads and writes, uploads and downloads.

It's difficult to distinguish between genuinely slow networks and slowloris clients.

The humanity in me thinks that Caddy should not have timeouts by default: assume that clients are benevolent, and that slow connections are just slow networks. In a sense, this serves a greater population that does not have access to high speed internet (even in urban, developed areas, like cell networks). Slowloris attacks are probably much less common than slow networks.

Slowloris is a type of DoS attack. Frankly, Caddy can't prevent DoS attacks (especially DDoS). No web server alone can really do that. You need something like Cloudflare to mitigate those.

One possibility in the future of "default timeouts" would be to have a "smart timeout" feature that kills connections that are uncharacteristically long or have unusually low transfer rates (connection duration vs. transfer rate are different! - one measures flow of bytes over time, allowing big file downloads, the other is just a timer). This might be a safer default, but not always. So the site owner might still have to disable timeouts.

On the other hand, the smaller part of my thinking is stubborn. It says that Caddy is secure by default and should keep the timeouts.

But I think I might disable them by default for now.
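
The duration-versus-transfer-rate distinction is straightforward to sketch in Go. Nothing like this exists in Caddy; it is only a rough illustration of the "smart timeout" idea above, and the rate floor and grace period are arbitrary numbers:

package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"time"
)

// rateConn cuts off a connection whose average inbound transfer rate stays
// below minRate once a grace period has passed. Unlike a plain duration
// timeout, a steady-but-slow large upload or download is never interrupted.
type rateConn struct {
	net.Conn
	start   time.Time
	bytes   int64
	minRate float64       // bytes per second; illustrative threshold
	grace   time.Duration // never judge a connection younger than this
}

func (c *rateConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	c.bytes += int64(n)
	if elapsed := time.Since(c.start); elapsed > c.grace {
		if rate := float64(c.bytes) / elapsed.Seconds(); rate < c.minRate {
			c.Conn.Close()
			return n, fmt.Errorf("inbound rate %.0f B/s below floor", rate)
		}
	}
	return n, err
}

// rateListener wraps Accept so every connection gets the rate check.
type rateListener struct{ net.Listener }

func (l rateListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	return &rateConn{
		Conn:    conn,
		start:   time.Now(),
		minRate: 512,              // half a KiB per second
		grace:   30 * time.Second, // leave slow-starting clients alone
	}, nil
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.Serve(rateListener{ln}, http.FileServer(http.Dir("."))))
}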

@mholt mholt self-assigned this Feb 20, 2017
@pwaller

pwaller commented Feb 21, 2017

One possibility in the future of "default timeouts" would be to have a "smart timeout"

I suppose the thing about slowloris is that it allows one attacker to starve a server with just one measly low-bandwidth net connection. If that's the attack you want to defend against, you could also do so by limiting the number of long-running connections from one source. This needs to be done with care because sometimes entire countries can be behind a single IPv4 address. So one needs to think carefully about what they are trying to defend against and how that might harmfully impact harmless use-cases. But maybe there is a useful metric which distinguishes slowloris-style problems.

Another thing that could be done is to make the timeout dependent on resource utilization. For instance, if we only have a handful of connections (let's say, 100), that's no problem; we can sustain those. But if we have 10,000, maybe it's time to say "sorry, right now we're not handling longer requests".

A problem with such an approach is that it is somewhat magical and could lead to things breaking in a way that is very hard to diagnose.
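
For what it's worth, the load-dependent variant is also easy to sketch as a listener wrapper in Go (again purely illustrative, not something Caddy does), and it makes the "magical" failure mode visible: whether a slow client survives depends entirely on how busy the server happens to be at that moment. The threshold and deadline are arbitrary:

package main

import (
	"log"
	"net"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

// pressureListener only enforces a read deadline when the server is under
// pressure: with few open connections every client may be as slow as it
// likes, but past a threshold each read must make progress within the
// deadline or the connection is dropped.
type pressureListener struct {
	net.Listener
	active    int64 // open connections, updated atomically
	threshold int64
	deadline  time.Duration
}

func (l *pressureListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	atomic.AddInt64(&l.active, 1)
	return &pressureConn{Conn: conn, l: l}, nil
}

type pressureConn struct {
	net.Conn
	l    *pressureListener
	once sync.Once
}

func (c *pressureConn) Read(p []byte) (int, error) {
	if atomic.LoadInt64(&c.l.active) > c.l.threshold {
		c.Conn.SetReadDeadline(time.Now().Add(c.l.deadline)) // busy: be strict
	} else {
		c.Conn.SetReadDeadline(time.Time{}) // quiet: no timeout at all
	}
	return c.Conn.Read(p)
}

func (c *pressureConn) Close() error {
	c.once.Do(func() { atomic.AddInt64(&c.l.active, -1) })
	return c.Conn.Close()
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	pl := &pressureListener{Listener: ln, threshold: 1000, deadline: 10 * time.Second}
	log.Fatal(http.Serve(pl, http.FileServer(http.Dir("."))))
}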

@tobya
Collaborator

tobya commented Feb 21, 2017

Is there a possibility of a set of presets, similar to php for fastcgi, that set particular values?

E.g.

timeout secure

or

timeout strict

which would set values to prevent slowloris attacks. This would allow for a simple alternative to the defaults.

@mholt
Member

mholt commented Feb 21, 2017

@pwaller I agree, I like simple better. I'd have to give it a lot more thought before anything else gets implemented there.

@tobya I guess, but saying timeouts 10s is basically the same thing and less obscure.
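
For reference, the directive looks roughly like this in a 0.9.x Caddyfile; the durations are examples only, and the block form is from memory, so check the docs for the exact sub-directive names:

# one value applies to all timeouts
timeouts 10s

# or set them individually
timeouts {
    read   10s
    header 10s
    write  20s
    idle   2m
}

# or disable them entirely
timeouts none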

@mholt mholt closed this as completed in f49e0c9 Feb 22, 2017
@mholt
Member

mholt commented Feb 22, 2017

I've disabled default timeouts in that commit. Also, because there is some room for improvement in how the standard lib does timeouts (as pointed out above), let's give it some time before we look into enabling them again: at least the issues above should be fixed first (Go 1.9, maybe).

@pwaller

pwaller commented Feb 22, 2017

@mholt thanks. By the way, I'm a little confused about which "Go 1.9 maybe" issues you are referring to. I couldn't find them mentioned in this issue.

@mholt
Member

mholt commented Feb 22, 2017

@pwaller Oops, they're linked over in #1422 - I refer to golang/go#18437 and also golang/go#16100.

AluisioASG added a commit to AluisioASG/snowweb that referenced this issue Apr 28, 2021
A [piece by Bojan Živanović][1] showed up on my reading list yesterday,
which covered four points of a production-ready http.Server:
- Timeouts
- Systemd socket activation
- TLS
- Graceful shutdown

Of those, only the first one isn't already handled in SnowWeb.  We're
referred to [another post on Cloudflare's blog][2] for more details,
but it doesn't explain how the suggested timeouts were decided upon.

Looking over at [what my current web server does][3], Caddy has a
timeout of five minutes in-between requests, and it [used to have][4]
read and write timeouts which [broke slow clients][5].  We can't have
a write timeout due to the reload API endpoint which can take an
arbitrarily long time to respond and because it'd cut large downloads,
but since we're a static file server and shouldn't be receiving uploads
at all, we can be less lenient while reading the request.

Implementing these timeouts protects against slowloris attacks (hereby
also known as ASLSP-over-HTTP).

[1]: https://bojanz.github.io/increasing-http-server-boilerplate-go/
[2]: https://blog.cloudflare.com/exposing-go-on-the-internet/
[3]: https://caddyserver.com/docs/caddyfile/options#timeouts
[4]: caddyserver/caddy#1368
[5]: caddyserver/caddy#1464
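
A minimal sketch of the server the commit message describes, assuming a plain net/http file server; the durations are illustrative, not the values SnowWeb actually ships:

package main

import (
	"log"
	"net/http"
	"time"
)

func main() {
	srv := &http.Server{
		Addr:    ":8080",
		Handler: http.FileServer(http.Dir("./public")),

		// Be strict while reading: a static file server expects small
		// requests and no uploads, so slow headers/bodies are cut off.
		ReadHeaderTimeout: 5 * time.Second,
		ReadTimeout:       10 * time.Second,

		// No write timeout: responses (large downloads, a slow reload
		// endpoint) may legitimately take an arbitrarily long time.
		WriteTimeout: 0,

		// Close keep-alive connections that sit idle between requests.
		IdleTimeout: 5 * time.Minute,
	}
	log.Fatal(srv.ListenAndServe())
}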