Caddy Hangs Frequently #1369

Closed
martindale opened this issue Jan 24, 2017 · 12 comments

@martindale

martindale commented Jan 24, 2017

1. What version of Caddy are you running (caddy --version)?

0.9.4

2. What are you trying to do?

Run Caddy reliably

3. What is your entire Caddyfile?

can provide privately, hostnames will need to be obscured

4. How did you run Caddy (give the full command and describe the execution environment)?

pm2 start caddy

5. What did you expect to see?

A working web server!

6. What did you see instead (give full error messages and/or log)?

Caddy occasionally hangs, timing out when web requests are sent.

7. How can someone who is starting from scratch reproduce this behavior as minimally as possible?

I'm not sure. It's unpredictable so far.

@mholt
Member

mholt commented Jan 24, 2017

Thanks for the report. Unfortunately this is entirely unactionable as-is, so I will be closing this.

I want to help, but there's nothing here. Can't you even reveal how many sites are hosted on it, whether it's HTTP or HTTPS, how long it has been running, what the output of curl -v or s_client is, anything? Traffic levels? There's no way every single line of the Caddyfile is private. Is it serving static files, proxying? Are you using it to proxy websockets to rovers on Mars? I need to know. What are the requests? Even if the behavior is sporadic, you know those, surely. Logs? Anything.

Also, make sure to use the new Caddy release going out tomorrow. It fixes a deadlock which might have something to do with what you're seeing, but it's literally impossible to know at this point. Still worth a shot. 😁

@mholt mholt closed this as completed Jan 24, 2017
@martindale
Author

martindale commented Jan 24, 2017

Thanks for your help, @mholt. I'd like to continue updating this issue if it persists, if you don't mind? I'll wait for the updated release. Mind linking me to the issue for the deadlock? I can do some debugging / stack tracing if you'd like.

@mholt
Member

mholt commented Jan 24, 2017

Yep! If it's what I suspect (again, having no real grounds for it), it's a duplicate of this: #1157

Fixed in: #1366

If, after the update, your server continues to exhibit the same behavior or you are confident it is something different, feel free to re-open with more details. Enabling pprof and getting a stack trace of all the goroutines would be helpful as well.

@martindale
Author

Thanks again, @mholt! I've deployed 0.9.5 and will report back with updates within two weeks.

@martindale
Author

So far so good, with one exception: this upgrade caused hosts with WebSocket configurations to continuously cycle connect/reconnect. @mholt any insight into this? I saw a mention of improved support in the release notes.

Configuration file looks something like this:

soundtrack.io:80 {
  redir https://soundtrack.io{uri}
}

*.soundtrack.io:80 {
  redir https://{hostonly}{uri}
}

soundtrack.io:443 {
  tls /FULL_PATH_OBSCURED/soundtrack.io.crt /FULL_PATH_OBSCURED/soundtrack.io.key
  proxy / localhost:13000 {
    websocket
    header_upstream Host soundtrack.io
  }
}

*.soundtrack.io:443 {
  tls /FULL_PATH_OBSCURED/soundtrack.io.crt /FULL_PATH_OBSCURED/soundtrack.io.key
  proxy / localhost:13000 {
    websocket
    header_upstream Host {hostonly}
  }
}

I've also tried with -http2=false as per some original documentation, to no avail. Please advise!

@mholt
Member

mholt commented Jan 27, 2017

The release notes very prominently describe the new timeouts: https://caddyserver.com/docs/timeouts

You should probably raise or disable them if you have long-lived connections, like websockets.

Glad it's working for you now!

@martindale
Author

So far. It happens once a week or so, and we've restarted the instance since then. We'll have to wait to be sure.

I have over 100 hosts, does timeout need to be added individually to each?

@mholt
Member

mholt commented Jan 27, 2017

No, if you read the docs it says you can just set it once per listener.

@martindale
Author

I've read the docs; they are unclear. Trying to add timeout to the top-level so that it applies to all hosts seems to break the config (Parse error: Unknown directive 'some.host'), and having to make configuration changes to every single host every time we upgrade is getting tedious. Perhaps following the semantic versioning guide would be useful here with regards to mandating configuration changes.

I've changed the timeout for one host to 1m to test, but it seems the connections are still getting restarted. Adding websocket to a host's configuration really should be sufficient, if we don't want WebSockets to be made available by default (which I believe they should be). Are we sure the read/write timers are being calculated correctly?

@mholt
Member

mholt commented Jan 27, 2017

> Trying to add timeout to the top-level so that it applies to all hosts seems to break the config

Yeah, you can't do that. :) It sounds like you might be new to Caddy. I suggest reading https://caddyserver.com/docs/caddyfile for information about how the Caddyfile syntax works.

> Perhaps following the semantic versioning guide would be useful here with regards to mandating configuration changes.

We generally do, but shifted down a version number until version 1.0. Until then, I'm not afraid to make breaking changes, and that's stated in the wiki. By 1.0, we should have a good definition of what breaking and non-breaking means.

> Are we sure the read/write timers are being calculated correctly?

We use the Go standard library. See golang/go#16100 for more information.

@martindale
Author

The trouble seems to be that the timer isn't incremented, even if data is sent over the WebSocket. This results in the WebSocket being dropped consistently. Is this the desired behavior?

@mholt
Member

mholt commented Jan 27, 2017

Not sure -- Go is handling the timeouts for us. We just set the values on the http.Server. You can take the issue up at golang/go there if you think it is a bug (it might be -- or it just might be more complicated than it seems from the surface). That issue I linked to will get you started.
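One possible explanation for the behavior described above, offered as a sketch rather than a diagnosis: in Go, a deadline on a net.Conn is an absolute point in time, and successful reads or writes do not push it back. The example below demonstrates this on an in-memory net.Pipe; the helper names (readUntilDeadline, isTimeout, demo) and the durations are hypothetical, and nothing here is taken from Caddy's source.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readUntilDeadline sets a single absolute read deadline on conn and keeps
// reading until an error occurs. It returns the number of successful reads
// and the error that ended the loop.
func readUntilDeadline(conn net.Conn, d time.Duration) (int, error) {
	if err := conn.SetReadDeadline(time.Now().Add(d)); err != nil {
		return 0, err
	}
	buf := make([]byte, 16)
	reads := 0
	for {
		if _, err := conn.Read(buf); err != nil {
			return reads, err
		}
		reads++ // traffic is flowing, yet the deadline is not extended
	}
}

// isTimeout reports whether err is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// demo wires up an in-memory connection whose peer sends data continuously.
func demo() (int, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	go func() {
		for {
			if _, err := client.Write([]byte("ping")); err != nil {
				return // pipe closed, stop sending
			}
			time.Sleep(5 * time.Millisecond)
		}
	}()

	return readUntilDeadline(server, 100*time.Millisecond)
}

func main() {
	reads, err := demo()
	fmt.Printf("reads before timeout: %d, timed out: %v\n", reads, isTimeout(err))
}
```

The reader times out even though the peer never stops sending. Code that wants an idle timeout instead has to call SetReadDeadline again after each successful read, which is the approach the net.Conn documentation suggests.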
