Add a feature to cache the most recent string index->cursor result #793

dpk · 2021-12-10T20:34:37Z

~~Not ready to merge yet. Needs more testing and benchmarks.~~

This is lighter-weight than building a full index->cursor table for the string, adding a constant two words to the memory required to store a string, as opposed to one word for every n characters. The cached cursor is used for any string-ref operation requesting an index after the most-recently-requested index, making potentially quadratic repeated string-ref procedures run in linear time. In theory, it could also use a heuristic to speed up moving backwards through the string when it thinks that moving the old cursor backwards would be faster than starting again at the start of the string. In practice, my logging of when the cached cursor is actually reused during the Chibi compilation and startup process shows that the most common case of moving backwards is going back to the start of the string anyway. (EDIT: Oh, but there are cases like string-index-right where an efficient implementation would want to move backwards character-by-character from the end of the string.)

I’m unsure how well this interacts with threading in Chibi. As I understand it, Chibi’s green threads will never interrupt a C function, so there should be no races regards getting and setting the cache. If multiple threads are accessing the same string at the same time, they’ll trample on each others’ use of the cursor — but it should still be quicker, in growth terms, for both threads than would be just starting at cursor/index 0 each time.

Benchmarks to follow. It remains to be seen if the cost of checking and setting the cursor cache actually improves performance overall, especially considering the impact on small strings. If results are favourable, I think it might even be reasonable to enable this option by default.

Input, especially code review, appreciated.

This is lighter-weight than building a full index->cursor table for the string, adding a constant two words to the memory required to store a string, as opposed to one word for every n characters. The cached cursor is used for any string-ref operation requesting an index after the most-recently-requested index, making potentially quadratic repeated string-ref procedures run in linear time. In theory, it could also use a heuristic to speed up moving backwards through the string when it thinks that moving the old cursor backwards would be faster than starting again at the start of the string. In practice, my logging of when the cached cursor is actually reused during the Chibi compilation and startup process shows that the most common case of moving backwards is going back to the start of the string anyway. Benchmarks to follow.

dpk · 2021-12-10T22:24:09Z

Preliminary (albeit likely unfairly flattering) benchmarks show that this patch does actually marginally speed up access to even very small strings. With the default SEXP_STRING_INDEX_TABLE_CHUNK_SIZE of 64, the results are most noticeable on strings slightly longer than 76 characters, the minimum to trigger Chibi into generating the offset table at all. With these strings SEXP_USE_STRING_INDEX_TABLE has a notable negative performance impact, whereas the new SEXP_USE_STRING_REF_CACHE is slightly faster (by an insignificant amount, possibly a sampling error). I’ve tested with strings in English (ASCII-only), German (mostly ASCII), Ancient Greek (mostly 2 bytes per character in UTF-8), and Sanskrit (mostly three bytes per character in UTF-8). I’ll post my benchmark code when I’ve tested longer strings.

Results on tiny strings (between 5 and 12 characters long):

vanilla          36.84 real        36.33 user         0.18 sys
string-index     37.72 real        36.52 user         0.15 sys
string-ref-cache 35.29 real        34.93 user         0.12 sys

Results on small strings (all just over 76 characters):

vanilla          105.69 real       104.41 user         0.52 sys
string-index     111.58 real       110.23 user         0.49 sys
string-ref-cache 104.66 real       103.59 user         0.40 sys

Mostly in the weeds so far, though it’s encouraging to see that it doesn’t seem to have any negative time impact. (Practical memory impact is yet to be tested.) As remarked in features.h, the big improvements, if any, will come with very large strings.

ashinn · 2021-12-10T22:43:09Z

I'm OK with even a small hit in string-ref performance if it helps some common quadratic patterns. This doesn't slow down the cursor API or any of the high-level APIs built on cursors.

The only real concern is the memory overhead. Chibi assumes 4-word aligned heap objects, so your change increases the current 4-word string wrapper to 8 words (not counting the underlying byte vector).

The other minor concern is this might encourage people to think string-ref is fast for all cases in Chibi and they don't need cursors, when in fact random access would still be slow.

dpk · 2021-12-11T00:17:36Z

I'm OK with even a small hit in string-ref performance if it helps some common quadratic patterns. This doesn't slow down the cursor API or any of the high-level APIs built on cursors.

Okay, good to know. So far, there doesn’t appear to be a speed performance hit, as you can see above. I’ll have to see what effect changing it to be able to optimize string accesses backwards as well as forwards has.

It just occurred to me that the same cache can be used to speed up sexp_string_cursor_to_index as well as sexp_string_index_to_cursor. I’ll also see about extending my patch to do that. That should also not slow down doing things the pure string-cursor way: the APIs which only use cursors seemingly leave sexp_string_cursor_to_index almost entirely untouched.

The only real concern is the memory overhead. Chibi assumes 4-word aligned heap objects, so your change increases the current 4-word string wrapper to 8 words (not counting the underlying byte vector).

Okay, good to know. How should I measure memory usage? SEXP_USE_DEBUG_GC or external tools or both? If the former, what figures should I look at particularly? The speed tests above are from Mac OS’s /usr/bin/time, which can’t measure memory usage, but I’ve installed GNU time which can.

The other minor concern is this might encourage people to think string-ref is fast for all cases in Chibi and they don't need cursors, when in fact random access would still be slow.

How common is truly random access to strings? I’m doubtful that it’s common enough that this need be a concern. The only pseudo-random access pattern I can think of is the one I mentioned above, of two threads simultaneously trying to process the same string. That could be a problem. Certainly people should still be encouraged to use the string cursor APIs when possible, though.

ashinn · 2021-12-11T00:53:21Z

You can check the chibi/red/snow image sizes for a start.

ashinn · 2021-12-17T02:28:35Z

Regarding non-linear usage patterns, backtracking regexp and parsing libraries fall into this case, as well as Boyer-Moore (though that tends to be more localized). Even non-backtracking parsers will tend to record submatch indexes to lazily generate substrings, and the caller may extract the substrings in an arbitrary order.

More pathological cases are a space optimized string set or dictionary which concatenates many strings into a single string, storing only the start offsets externally (e.g. in a compact uvector). The entries may be accessed entirely randomly.

dpk · 2021-12-27T18:15:27Z

You can check the chibi/red/snow image sizes for a start.

For me, they’re all exactly the same size as they are when built without this patch.

dpk · 2021-12-28T20:22:08Z

I’ve possibly discovered a bug in SEXP_USE_STRING_INDEX_TABLE while writing benchmarks. It seems that sometimes the index table can possibly sometimes contain invalid cursors (pointing to partial UTF-8 bytes) — at least that’s the only explanation I can think of for a program which only uses string-ref (no string-set!, no direct use of cursors) coming out with ERROR in next-char on line 25 of file benchmarks/strings/02-random-char-access-file.scm: string-ref: invalid utf8 byte: {String-Cursor #15 5184}. Will investigate.

dpk · 2022-02-10T18:36:46Z

Still haven’t had time to finish working out some practical benchmarks for this, but coincidentally I did just learn that Emacs uses basically the same trick.

(It doesn’t appear to be officially used in the Emacs source, but the name ‘bookmark’ for this is actually pretty good. Much less awkward than ‘string-ref cache’)

ashinn · 2022-02-11T04:47:09Z

Thanks for the link Daphne, that's an interesting article. Obviously I prefer the Julia/Go approach :)

[skip ci]

dpk · 2022-03-03T17:59:33Z

Chicken Scheme is also now planning to switch to doing this.

I’ve merged in a branch which lets the bookmark move backwards as well as forwards through the string. I had intended to benchmark that separately, since moving backwards is slower than moving forwards for a few reasons, but in practice the factors involved don’t seem to be large enough for this to be a concern — there‘s still generally a speedup compared to starting from the beginning of the string.

I’m marking this ready for merge, but I’d appreciate a code review. In particular, I’m not sure if my change to the sexp_struct for strings requires a change in _sexp_type_specs — that code is just completely opaque to me. (Could this be why the image sizes weren’t affected?)

dpk · 2022-09-26T08:35:53Z

Apologies for the bump, but I had someone recently asking when this would be merged. Could you take a look at the remaining issue with _sexp_type_specs?

ashinn · 2022-10-01T12:51:06Z

Apologies for the delay, I lost track of this. You don't need to touch _sexp_type_specs in this case. I'll take a final look shortly before merging.

ashinn · 2022-10-19T08:59:12Z

LGTM, sorry for the delay.

dpk added 2 commits December 11, 2021 10:44

Use the cursor cache to speed up string-cursor->index as well

4e0f10a

Make the string cursor able to run backwards as well as forwards

a746370

Merge branch 'master' into string-ref-cache

9419fb1

Merge branch 'string-ref-cache' into string-ref-cache-backwards

9dcda90

dpk marked this pull request as draft January 23, 2022 19:49

Indentation fix

476ae19

[skip ci]

dpk marked this pull request as ready for review March 3, 2022 18:00

ashinn merged commit d67fa42 into ashinn:master Oct 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a feature to cache the most recent string index->cursor result #793

Add a feature to cache the most recent string index->cursor result #793

dpk commented Dec 10, 2021 •

edited

Loading

dpk commented Dec 10, 2021

ashinn commented Dec 10, 2021

dpk commented Dec 11, 2021

ashinn commented Dec 11, 2021

ashinn commented Dec 17, 2021

dpk commented Dec 27, 2021

dpk commented Dec 28, 2021

dpk commented Feb 10, 2022 •

edited

Loading

ashinn commented Feb 11, 2022

dpk commented Mar 3, 2022 •

edited

Loading

dpk commented Sep 26, 2022

ashinn commented Oct 1, 2022

ashinn commented Oct 19, 2022

Add a feature to cache the most recent string index->cursor result #793

Add a feature to cache the most recent string index->cursor result #793

Conversation

dpk commented Dec 10, 2021 • edited Loading

dpk commented Dec 10, 2021

ashinn commented Dec 10, 2021

dpk commented Dec 11, 2021

ashinn commented Dec 11, 2021

ashinn commented Dec 17, 2021

dpk commented Dec 27, 2021

dpk commented Dec 28, 2021

dpk commented Feb 10, 2022 • edited Loading

ashinn commented Feb 11, 2022

dpk commented Mar 3, 2022 • edited Loading

dpk commented Sep 26, 2022

ashinn commented Oct 1, 2022

ashinn commented Oct 19, 2022

dpk commented Dec 10, 2021 •

edited

Loading

dpk commented Feb 10, 2022 •

edited

Loading

dpk commented Mar 3, 2022 •

edited

Loading