Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix runtime config hot reload #150

Closed
ysawa0 opened this issue Jun 11, 2020 · 6 comments
Closed

Fix runtime config hot reload #150

ysawa0 opened this issue Jun 11, 2020 · 6 comments

Comments

@ysawa0
Copy link
Member

ysawa0 commented Jun 11, 2020

Relates to #132

I'm not sure if the current implementation is intended but I've noticed config hot reloading is not working as expected (locally on Docker and with k8s ConfigMaps)

After some debugging I traced the issue down to — SymLinkRefresher watches only the base runtime path instead of RuntimePath/RuntimeSubdirectory/config
https://github.com/lyft/goruntime/blob/master/loader/symlink_refresher.go#L10

To fix this I would like to:

  1. Update goruntime package being used to latest. From 0.2.1 ⇒ 0.2.4
  1. After 1, I'd like to switch SymLink to DirectoryRefresher in https://github.com/envoyproxy/ratelimit/blob/master/src/server/server_impl.go#L172-L176 and update the paths being passed in so hot reload works.

If Lyft maintainers want to chime in and say if it's working as intended for them, I can add this as a configurable option instead of permanent change.

@mattklein123
Copy link
Member

If Lyft maintainers want to chime in and say if it's working as intended for them, I can add this as a configurable option instead of permanent change.

As long as there is no user visible change to existing behavior (what we have today continues to work) this is fine, otherwise it needs to be configurable. I don't 100% understand the problem but I would assume that the code can become a superset of what is watched and handle any other cases that are missed.

@ysawa0
Copy link
Member Author

ysawa0 commented Jun 11, 2020

@mattklein123 I guess my question for you would be, is runtime config reloading working for your team currently? I wouldn't want to brake anything for you if it's working fine already for your use case.
Whenever I made changes in RuntimePath/RuntimeSubdirectory/config (by updating symlinks or updating a file directly) I didn't see those changes being picked up by ratelimit service.

@mattklein123
Copy link
Member

Yes it works currently. There have been similar issues fixed in Envoy around how K8s does filesystem watching, so I assume this is a related issue. See envoyproxy/envoy#6215 and envoyproxy/envoy#10163 for example.

@dweitzman
Copy link
Contributor

dweitzman commented Jun 12, 2020

I'm not particularly familiar with the libraries involved, but at a glance my impression is that DirectoryRefresher may not behave how you're hoping or how people might expect if they're under the impression that it does general in-place config reloading for whatever may change.

I may be wrong, but it looks at a glance like:

  • DirectoryRefresher wouldn't notice a deleted file, only newly created files or modified files
  • DirectoryRefresher would only notice files being added or modified within RuntimePath/RuntimeSubdirectory, but not within RuntimePath/RuntimeSubdirectory/config/. fsnotify doesn't seem to claim to natively support recursive directory watches yet (User-space recursive watcher fsnotify/fsnotify#18) and goruntime doesn't appear to be trying to support that either
  • DirectoryRefresher would probably also not catch changes to RuntimePath/, which would likely break existing deployments that depend on the SymlinkRefresher behavior which watches RuntimePath/

I'm new to this codebase, so please ignore me if I'm completely wrong

@ysawa0
Copy link
Member Author

ysawa0 commented Jun 12, 2020

Okay, I think I understand how Lyft is performing the runtime updates: the symlink to root RuntimePath is updated, which does trigger SymlinkRefresher and config reload.

Problem is, I think other users are updating contents in RuntimePath/RuntimeSubdirectory/config/ which doesn't trigger any reload but expecting it to.

The fix I put into our build to get this working is to update the loader in https://github.com/envoyproxy/ratelimit/blob/master/src/server/server_impl.go#L190-L195
to:

	ret.runtime = loader.New(
		filepath.Join(s.RuntimePath, s.RuntimeSubdirectory),
		"config",
		ret.store.Scope("runtime"),
		&loader.DirectoryRefresher{},
		loaderOpts...)

such that the config dir is watched directly.

@dweitzman you're absolutely right. fsnotify does not watch recursively and updating to DirectoryRefresher would break the existing deployments you mentioned. Though looks like deletes are not noticed by either Symlink or DirectoryRefresher.

I think solution here is to introduce a new config allowing user to choose either of the reload models as well as updating docs with more details on how configs can be updated. What do you think?

@mattklein123
Copy link
Member

I think solution here is to introduce a new config allowing user to choose either of the reload models as well as updating docs with more details on how configs can be updated. What do you think?

Yup config driven SGTM, thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants