Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of spliceChangesIntoString #312

Merged
merged 6 commits into from
Aug 15, 2024
Merged

Conversation

thecrypticace
Copy link
Contributor

@thecrypticace thecrypticace commented Aug 13, 2024

The original implementation built strings in reverse because it was logically simpler to reason about however it resulted in a significant performance and memory usage impact on larger strings with lots of changes. This could be seen in Svelte, Liquid, and Angular.

This implementation is based on #311 (thanks @ABuffSeagull) with some additional tweaks which increase performance further. I ran some benchmarks on my M3 Max (12p / 4e) and tested strings of various sizes with a different number of changes:

  • a small string: 220 bytes with 2 changes
  • a medium string: 4.4KB with 5 changes
  • a large string: 44KB with 50 changes
  • an extra large string: 4.4MB with 500 changes
  • an extra large string with a large number of changes: 4.4MB with 5,000 changes

While we're not likely to run into cases with extra large strings I felt it was still useful to analyze the performance impact in extreme cases so we can be confident that this is not a performance bottleneck for any file (or expression) we're likely to encounter.

Before this PR:

  name                                       hz       min       max      mean       p75       p99      p995      p999     rme  samples
· small string                     7,001,413.96    0.0000    0.5530    0.0001    0.0001    0.0002    0.0002    0.0003  ±0.51%  3500707
· medium string                      694,515.66    0.0011    0.1906    0.0014    0.0013    0.0018    0.0020    0.0685  ±0.82%   347258
· large string                         9,490.75    0.0776    0.1989    0.1054    0.1488    0.1647    0.1677    0.1751  ±0.84%     4746
· extra large string                     2.8841    339.10    356.18    346.73    352.49    356.18    356.18    356.18  ±1.26%       10
· extra large string (5k changes)        0.2724  3,560.79  3,812.77  3,671.26  3,689.60  3,812.77  3,812.77  3,812.77  ±1.26%       10

After this PR:

  name                                       hz     min     max    mean     p75     p99    p995    p999     rme  samples
· small string                     8,163,442.24  0.0000  0.5489  0.0001  0.0001  0.0002  0.0002  0.0003  ±0.23%  4081722
· medium string                    4,052,634.65  0.0002  0.2099  0.0002  0.0003  0.0003  0.0004  0.0005  ±0.54%  2026439
· large string                       545,278.66  0.0016  0.1896  0.0018  0.0018  0.0022  0.0023  0.0077  ±0.46%   272640
· extra large string                  57,464.22  0.0155  0.2262  0.0174  0.0168  0.0218  0.1151  0.1345  ±0.54%    28733
· extra large string (5k changes)      5,818.68  0.1508  0.3839  0.1719  0.1632  0.3294  0.3367  0.3700  ±0.76%     2910

This amounts to roughly a:

  • ~1.2x speed up for small strings
  • ~5.8x speed up for medium strings
  • ~57.5x speed up for large strings
  • ~20,000x speed up for extra large strings (yes, really)

So, in the old implementation, longer and longer strings had a much bigger perf impact.

Additionally, this should reduce memory usage by quite a bit but I don't have concrete numbers for that. (Just some small random testing which confirmed that usage was lower)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants