Speed up resource lookup and fallback at runtime #2699
@Manishearth @robertbastian Can you weigh in and suggest alternatives?
Let's do it.
This makes sense and I'd have expected this to happen anyway.
Do we actually have measurements confirming this as a bottleneck? One other improvement I can think of: many keys will have identical locale sets. Our providers could cache the last requested locale and index, and if the next key uses the same locales (as determined at datagen time), we already know its index. This would only help in non-fallback lookup though. For fallback we could cache the last key+locale+index, and if the next request is for the same key, we can start the search where we expected the last locale to be. As it's fallback, we usually look for a prefix of the previous locale, so that should be in about the same location.
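A minimal sketch of the caching idea, assuming a hypothetical `CachedLookup` wrapper over a sorted per-key locale table; none of these names exist in ICU4X:

```rust
// Sketch only: cache the last resolved (locale, index) so that consecutive
// lookups for keys that share the same locale set can skip the binary search.
use std::cell::RefCell;

struct CachedLookup {
    // (last requested locale, its index in that key's locale table)
    last: RefCell<Option<(String, usize)>>,
}

impl CachedLookup {
    fn new() -> Self {
        Self { last: RefCell::new(None) }
    }

    /// Resolve `locale` within a key's sorted `locales` table.
    fn lookup(&self, locale: &str, locales: &[&str]) -> Option<usize> {
        // Many keys ship the exact same locale set, so if the previous key
        // resolved this locale at index `i`, check that position first.
        if let Some((cached_locale, i)) = &*self.last.borrow() {
            if cached_locale == locale && locales.get(*i) == Some(&locale) {
                return Some(*i);
            }
        }
        // Cache miss: fall back to the ordinary binary search and remember
        // the result for the next key. A fallback-aware variant would also
        // cache the key and start the prefix search from this index.
        let i = locales.binary_search(&locale).ok()?;
        *self.last.borrow_mut() = Some((locale.to_owned(), i));
        Some(i)
    }
}
```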
Discussion:
In #4207 I'm implementing a ZeroTrie variant which, if the previous benchmarks hold, is faster and smaller than the current ZeroMap2d variant. ZeroHashMap is likely still the fastest option, but it would incur some data size cost. So I'm keeping this issue open but lowering the priority.
Raising the priority again because the ZeroTrie optimization was only applied to the blob provider. We should still explore a solution for the baked provider. This performance impact manifests itself in NeoDateTimeFormatter, which hits the provider more often than the previous iteration of DateTimeFormatter.
An issue I'm encountering now is that in datetime format, the ZeroTrie lookup tables for skeleta are fairly large. If we were able to share strings between keys (calendars), they could be made smaller. For example, with one level of indirection, we could store a table of locale strings, and then each key's lookup table would store an index into that table instead of the full string.
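As a rough illustration of the indirection idea, here is a minimal sketch using plain Vec-based types rather than ICU4X's zero-copy structures; `SharedLocaleTable` and `KeyLookupTable` are hypothetical names:

```rust
// Sketch only: share locale strings between keys (e.g. calendars) via one
// level of indirection, so each key's table stores small indices instead of
// repeating the locale strings.

/// One shared, deduplicated table of locale strings for the whole blob.
struct SharedLocaleTable<'a> {
    locales: Vec<&'a str>, // sorted, so per-key lookup can stay a binary search
}

/// Per-key lookup table: indices into the shared table plus payload offsets.
struct KeyLookupTable {
    locale_indices: Vec<u32>,  // index into SharedLocaleTable::locales
    payload_offsets: Vec<u32>, // offset of the payload for that locale
}

impl KeyLookupTable {
    fn get(&self, shared: &SharedLocaleTable<'_>, locale: &str) -> Option<u32> {
        // Binary search over the indices, comparing through the shared table.
        let pos = self
            .locale_indices
            .binary_search_by(|&i| shared.locales[i as usize].cmp(locale))
            .ok()?;
        Some(self.payload_offsets[pos])
    }
}
```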
That would be cool. I've wanted to amortize space between aux keys by storing them at a different layer in blob.
Just an idea for how to do this in Blob Provider:
I think this should amortize the cost of the aux-key-heavy data keys quite nicely.
(2.) is using …
I did some experimentation. For an export of all date skeleton keys over all locales and maximal deduplication, here's what I get:
These figures aren't compelling enough for me to continue work on this blob layout. I archived my code here:
#2683 and #834, as well as #2686, are largely centered around optimizations that are performed at datagen time.
We may want to consider opportunities for runtime improvements as well. Ideas include:
Use hash maps: The lowest-hanging fruit here would be to leverage ZeroHashMap to make resource lookups closer to constant-time, especially when hundreds of locales ship in a single data file. We should at least do this in BakedDataProvider, since we can more easily change the data model post 1.0, and if people are building hundreds of locales, they should probably be using BakedDataProvider anyway. (A rough sketch follows after this list.)
Micro-optimizations in the fallback algorithm: For example, we can probably look up the input DataLocale directly in the resource file and return early on an exact match, before performing the normalization and fallback steps. (See the fast-path sketch after this list.)
Micro-optimizations in cmp_bytes (aka strict_cmp): Another bottleneck in resource lookup is the cmp_bytes function in DataLocale. There may be opportunities to improve the performance of this operation.
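A minimal sketch of the hash-map idea, using std's HashMap as a stand-in for a zero-copy structure such as ZeroHashMap; the table layout and names are hypothetical:

```rust
// Sketch only: constant-time lookup keyed by a combined "key-path/locale"
// string. A real implementation would use a zero-copy map (e.g. ZeroHashMap)
// over borrowed bytes rather than owned Strings.
use std::collections::HashMap;

struct HashLookup {
    // Maps "key-path/locale" to an index into the payload store.
    table: HashMap<String, usize>,
    payloads: Vec<Vec<u8>>,
}

impl HashLookup {
    fn get(&self, key_path: &str, locale: &str) -> Option<&[u8]> {
        // One hash lookup instead of a binary search over all locales.
        let lookup_key = format!("{key_path}/{locale}");
        let idx = *self.table.get(&lookup_key)?;
        self.payloads.get(idx).map(|p| p.as_slice())
    }
}
```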
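And a minimal sketch of the exact-match fast path; `lookup_exact` and `run_fallback_chain` are hypothetical helpers standing in for the provider's real lookup and fallback machinery:

```rust
// Sketch only: try the caller's locale verbatim before running normalization
// and the fallback chain.
fn load_with_fast_path<'a>(
    requested: &str,
    lookup_exact: impl Fn(&str) -> Option<&'a [u8]>,
    run_fallback_chain: impl Fn(&str) -> Option<&'a [u8]>,
) -> Option<&'a [u8]> {
    // Fast path: an exact hit (common for already-canonical locales such as
    // "en-US") skips normalization and fallback entirely.
    if let Some(payload) = lookup_exact(requested) {
        return Some(payload);
    }
    // Slow path: normalize the locale and walk the fallback chain
    // (roughly "en-US" -> "en" -> "und").
    run_fallback_chain(requested)
}
```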
I'll also point out that one may naively believe that hitting the data provider multiple times per constructor (e.g., loading date symbols, time symbols, and number symbols independently) carries a performance penalty. However, I would point to previous evidence that by leveraging smaller keys, we save on deserialization/validation cost, and that smaller keys typically result in smaller data files due to more deduplication opportunities (although whenever a key split is proposed, the size impact should be measured).