
feat: substantially improve avro deserialization performance #6201

Merged
merged 1 commit on Sep 15, 2020

Conversation

agavra
Contributor

@agavra agavra commented Sep 14, 2020

Description

fixes #6189
fixes #6190

Wooohoo!! I was able to figure out a way to get this gain without introducing a cache 😂 that means we get even better performance than what I suggested in #6106 because we don't need to hash Schema objects as cache keys.


Previously, we were creating a case-insensitive field map per record, and always calling toUpperCase on every field, even when the case-sensitive version would have matched. The performance profile (detailed in #6106) revealed that these two points together were taking up a large chunk of time.

There are two schemas in play: the ksql schema and the value schema (taken from schema registry). The original code created a mapping from expected field name (case-insensitive) to the value field. Then, for each ksql field, it looked in this map by name to see if it could find a match. If it did, it would place the value from that field into the output record.
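The original pattern described above might look like this simplified sketch. This is hypothetical illustration only, the real ksqlDB code works on Connect Schema/Struct objects rather than plain maps, and the names here (`translate`, `ksqlFieldNames`) are invented:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simplification of the ORIGINAL lookup pattern.
public class OldLookup {
    static Map<String, Object> translate(List<String> ksqlFieldNames,
                                         Map<String, Object> record) {
        // Rebuilt for EVERY record: upper-cased value-field name -> value.
        Map<String, Object> caseInsensitive = new HashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            caseInsensitive.put(e.getKey().toUpperCase(), e.getValue());
        }

        // Every ksql field pays a toUpperCase call, even on exact matches.
        Map<String, Object> out = new HashMap<>();
        for (String ksqlName : ksqlFieldNames) {
            Object value = caseInsensitive.get(ksqlName.toUpperCase());
            if (value != null) {
                out.put(ksqlName, value);
            }
        }
        return out;
    }
}
```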

The new code reverses this lookup pattern: for each field in the actual record, it looks for a match in the ksql schema. If there isn't one, it checks whether the upper-cased version of the field name exists in the ksql schema.

This improves performance on two fronts:

  1. it no longer creates the case-insensitive mapping per record deserialized. Instead, it leverages the existing ksql schema as the mapping
  2. it no longer calls toUpperCase on fields when the case-sensitive version matches. This further improves the case-sensitive path (this is key because we used to spend 23% of our deserialization time just calling String.toUpperCase; we trade that for an extra map lookup, which is usually cheap because Java strings cache their hash codes)
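The reversed lookup can be sketched as follows. Again, this is a hypothetical simplification with invented names, not the actual ksqlDB implementation; note that toUpperCase only runs on the fallback path:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of the NEW, reversed lookup pattern.
public class NewLookup {
    static Map<String, Object> translate(Set<String> ksqlFieldNames,
                                         Map<String, Object> record) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            String name = e.getKey();
            if (ksqlFieldNames.contains(name)) {
                // Case-sensitive hit: no toUpperCase, no per-record map.
                out.put(name, e.getValue());
            } else {
                // Only on a miss do we pay for the upper-cased lookup.
                String upper = name.toUpperCase();
                if (ksqlFieldNames.contains(upper)) {
                    out.put(upper, e.getValue());
                }
            }
        }
        return out;
    }
}
```

The key design point is that the ksql schema's own field set serves as the lookup structure, so nothing is allocated per record.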

Testing done

Running the benchmark locally shows an improvement of about 40% for the metrics schema (the gain is less impressive on the smaller impressions schema):

```
Benchmark                      (schemaName)  (serializationFormat)  Mode  Cnt  Score   Error  Units
SerdeBenchmark.deserialize          metrics                   Avro  avgt    3  8.294 ± 0.993  us/op
SerdeBenchmark.deserializeNew       metrics                   Avro  avgt    3  4.722 ± 0.220  us/op
```

Profiling results now show fewer easy-to-fix hot spots. Before the optimization, we spent nearly twice as much time in AvroDataTranslator as in actually deserializing the Avro:

[profiler flame graph: before]

Afterwards (funnily, the percentages exactly flipped!), we spend half as much time in AvroDataTranslator as we do deserializing the data:

[profiler flame graph: after]

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested a review from a team as a code owner September 14, 2020 23:31
Member

@vpapavas vpapavas left a comment

Excellent analysis @agavra ! LGTM!

@agavra agavra merged commit 325a009 into confluentinc:master Sep 15, 2020
@agavra agavra deleted the improve_cdt branch September 15, 2020 23:50
@big-andy-coates
Contributor

Nice one @agavra.

We're still paying the price of converting through two Connect schemas though, right? Do you think we'd gain much by dropping the Connect dependencies and coding this directly as ksqlDB types?

@agavra
Contributor Author

agavra commented Sep 21, 2020

@big-andy-coates I'm not entirely sure what you mean by "converting through two Connect schemas". If what you mean is converting the Avro directly into a ksql-compatible memory representation (avoiding use of AvroConverter altogether) then I think there's a ton of optimization potential there (on the order of an additional 40-50% improvement). The investment for that is much higher though 😕

You can see the full breakdown of costs in the flame chart on #6106
