-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce heap memory usage #31
Conversation
bc765f7
to
b2ecdcf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; Excellent optimisations. Only a few minor comments.
@@ -32,7 +33,7 @@ func init() { | |||
|
|||
snapshotsCmd.AddCommand(snapshotsDumpCmd) | |||
snapshotsDumpCmd.Flags().StringP("format", "f", "debug", | |||
"Output format, one of: 'debug' (default), 'text'") | |||
"Output format, one of: 'debug' (default), 'text' (same)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to cater to a future expansion to allow for multiple formats, perhaps have debug
as its own flag so you can specify the level of detail (normal, debug) and the format (text, some other format)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this switch was a bit different.
In 0.3.0, you could use two output formats:
- 'text' is a google protobuf encoding that is reversible to binary, but it makes all the key-values unreadable
- 'debug' shows the all the key-value entries in a more friendly way
Now that we drop the generated protobuf code, we also loose the 'text' encoder'. I decided to just make this the same as 'debug', instead of dropping the 'text' option, because I figured it would be less likely to break anything.
d.r.snapshotsByInstance[d.instance] = snapshot.Update{ | ||
Snapshot: msg, | ||
NameInfo: ni, | ||
OnClose: func(u *snapshot.Update) { | ||
if u.Snapshot == nil { | ||
return // already called? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it will not be reached, but if it ever does it looks like a potential risk of keeping a token without release()-ing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two ways we can handle calling Close() twice:
- Ignore a second call.
- panic()
The advantage of the first is that you can always defer u.Close()
, even if you explicitly call it somewhere. Having this option may reduce the chance that you forget to call it in the first place.
The second one us strict about you calling it only once. This may detect incorrect logic in some cases, but actually increases the chance that you forget to call it in some code paths.
In this case I picked option 1, because it seemed like the least bad option.
The bigger issue happens when Close is not called. Perhaps we should add a runtime.SetFinalizer
to detect that scenario, and then panic?
Limit number of downloaded and decompressed snapshots waiting to be loaded, to reduce the memory usage. To download or decompress a snapshot, you need to hold a token that is in limited supply, configurable with two new `memory_*` options.
The test was failing with the new memory limit fallback to 1. This also removes the dependency on magic sleep times, and reduces the new snapshots check interval for a faster test run.
We do not need this index, it was already commented out.
3413bda
to
b51c6b8
Compare
It should not spell check variable names, this is just too annoying. (...Spell check...) Error: snapshot/dbi.go:8:26 ... 33, Error - `csproto` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:28:25 ... 33, Error - `reallocs` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:117:44 ... 51, Error - `varints` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi_test.go:36:2 ... 8, Error - `pbdata` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi_test.go:88:55 ... 63, Error - `Mentries` is not a recognized word. (unrecognized-spelling) Error: snapshot/doc.go:4:40 ... 55, Error - `deserialisation` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/compat_test.go:1:9 ... 21, Error - `gogosnapshot` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/compat_test.go:15:2 ... 6, Error - `popr` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:1005:2 ... 6, Error - `tmps` is not a recognized word. (unrecognized-spelling) Error: syncer/receiver/receiver.go:11:47 ... 53, Error - `climit` is not a recognized word. (unrecognized-spelling) Error: syncer/sync_test.go:163:38 ... 44, Error - `dbimsg` is not a recognized word. (unrecognized-spelling) Error: syncer/utils.go:100:28 ... 40, Error - `optimisation` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:122:13 ... 20, Error - `csproto` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:122:57 ... 64, Error - `csproto` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:123:13 ... 20, Error - `csproto` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi.go:127:13 ... 20, Error - `csproto` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi_test.go:38:27 ... 33, Error - `pbdata` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi_test.go:43:24 ... 30, Error - `pbdata` is not a recognized word. (unrecognized-spelling) Error: snapshot/dbi_test.go:106:55 ... 63, Error - `Mentries` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/compat_test.go:16:31 ... 35, Error - `popr` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/regenerate.go:1:9 ... 21, Error - `gogosnapshot` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:4:9 ... 21, Error - `gogosnapshot` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:321:34 ... 46, Error - `gogosnapshot` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:322:35 ... 47, Error - `gogosnapshot` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:1007:3 ... 7, Error - `tmps` is not a recognized word. (unrecognized-spelling) Error: snapshot/gogosnapshot/snapshot.pb.go:1009:16 ... 20, Error - `tmps` is not a recognized word. (unrecognized-spelling) Error: snapshot/load.go:21:52 ... 60, Error - `reallocs` is not a recognized word. (unrecognized-spelling) Error: snapshot/load.go:49:52 ... 60, Error - `reallocs` is not a recognized word. (unrecognized-spelling) Error: snapshot/load_test.go:33:45 ... 53, Error - `Mentries` is not a recognized word. (unrecognized-spelling) Error: syncer/receiver/receiver.go:36:30 ... 36, Error - `climit` is not a recognized word. (unrecognized-spelling) Error: syncer/receiver/receiver.go:41:26 ... 32, Error - `climit` is not a recognized word. (unrecognized-spelling) Error: syncer/receiver/receiver.go:79:29 ... 35, Error - `climit` is not a recognized word. (unrecognized-spelling) Error: syncer/receiver/receiver.go:80:29 ... 35, Error - `climit` is not a recognized word. (unrecognized-spelling) Error: syncer/utils.go:116:37 ... 45, Error - `reallocs` is not a recognized word. (unrecognized-spelling) (...Compare expect with new output...)
Several changes to reduce the maximum heap memory usage of LS.
The biggest change is to switch from a generated gogo-protobuf parser to a custom parser. This new parser uses a light low-level abstraction of the protobuf wire format to avoid allocating a large
[]KV
slice with all items. EveryKV
item in this slice used 64 bytes, plus reallocation overhead as items are appended, plus (in 0.3.0) the copied key and value data as individual allocs, plus GC overhead. Instead, we now directly parse the protobuf wire data as we loop over the items one by one. We also take snapshots directly to protobuf wire format, without an intermediate slice representation.We now limit the number of snapshots that can concurrently be held in memory using the new
memory_downloaded_snapshots
andmemory_decompressed_snapshots
configuration options. In the past this number could grow up to the number of unique instance names.We now preallocate byte slices that are large enough to hold all the data that we will generate. This avoids many subsequent slice reallocations and copies to grow a slice, doubling the size every time.
Finally, we immediately force a garbage collection cycle as soon as we have released a large block of memory.
In local testing with 1M test domains and 6M records and snapshots from 3 instances, this reduced the peak RSS from over 10 GB (in v0.3.0) to under 3 GB. With more instances, the savings should be even higher with the new limits in place.
To verify that we are writing and reading the protobuf format correctly, this PR adds tests that perform roundtrips with the old gogo-protobuf code, and verify that the implementation is correct. The proto file will contrinue to be updated to reflect any new keys, and gogo can automatically generate code to fill fields with tests data, and methods that can compare all fields before and after a roundtrip.
Several other tests and benchmarks have been added.