gh-117431: Improve performance of startswith and endswith #117782

eendebakpt · 2024-04-11T20:57:13Z

Improve performance of startswith and endswith by eliminating double work in tailmatch.

Benchmark results:

single-character match: x.startswith('a'): Mean +- std dev: [main_startswith] 80.6 ns +- 0.7 ns -> [v2_startswith] 69.8 ns +- 0.8 ns: 1.15x faster
single-character fail: x.startswith('q'): Mean +- std dev: [main_startswith] 67.8 ns +- 0.6 ns -> [v2_startswith] 68.4 ns +- 0.7 ns: 1.01x slower
two-character match: x.startswith('ab'): Mean +- std dev: [main_startswith] 81.3 ns +- 2.1 ns -> [v2_startswith] 75.9 ns +- 0.6 ns: 1.07x faster
two-character fail head: x.startswith('qb'): Mean +- std dev: [main_startswith] 68.0 ns +- 1.1 ns -> [v2_startswith] 76.0 ns +- 0.9 ns: 1.12x slower
two-character fail tail: x.startswith('aq'): Mean +- std dev: [main_startswith] 73.0 ns +- 3.3 ns -> [v2_startswith] 68.5 ns +- 0.9 ns: 1.07x faster
multi-character match: x.startswith('abcdefghijkl'): Mean +- std dev: [main_startswith] 84.9 ns +- 1.1 ns -> [v2_startswith] 78.7 ns +- 2.3 ns: 1.08x faster
multi-character fail middle: x.startswith('abcdef_hijkl'): Mean +- std dev: [main_startswith] 85.3 ns +- 2.0 ns -> [v2_startswith] 78.6 ns +- 0.8 ns: 1.09x faster

Benchmark hidden because not significant (3): empty: x.startswith(''), multi-character different kind match: xu.startswith('abcdefghijkl'), multi-character different kind fail: xu.startswith('abcdef_hijkl')

Geometric mean: 1.03x faster

Benchmark script

import pyperf
runner = pyperf.Runner()

setup="""
x = 'abcdefghijklmnop'
y = 'abcdefghijklmnop_bbbbbbbbbbbbbbbbbb'

xu = x + '\u1234'
yu = y + '\u1234'
l = 'a' * 1000 + 'b'
x_startswith = x.startswith
"""

# Tested with ./python sw.py --rigorous -o main_startswith.json

if 1:
    runner.timeit(name="empty: x.startswith('')", stmt="x.startswith(''); y.startswith('')", setup=setup)
    runner.timeit(name="single-character match: x.startswith('a')", stmt="x.startswith('a'); y.startswith('a')", setup=setup)
    runner.timeit(name="single-character fail: x.startswith('q')", stmt="x.startswith('q'); y.startswith('q')", setup=setup)
    
    runner.timeit(name="two-character match: x.startswith('ab')", stmt="x.startswith('ab'); y.startswith('ab')", setup=setup)
    runner.timeit(name="two-character fail head: x.startswith('qb')", stmt="x.startswith('qb'); y.startswith('qb')", setup=setup)
    runner.timeit(name="two-character fail tail: x.startswith('aq')", stmt="x.startswith('aq'); y.startswith('aq')", setup=setup)
    runner.timeit(name="multi-character match: x.startswith('abcdefghijkl')", stmt="x.startswith('abcdefghijkl'); y.startswith('abcdefghijkl')", setup=setup)
    runner.timeit(name="multi-character fail middle: x.startswith('abcdef_hijkl')", stmt="x.startswith('abcdef_hijkl'); y.startswith('abcdef_hijkl')", setup=setup)

    runner.timeit(name="multi-character different kind match: xu.startswith('abcdefghijkl')", stmt="xu.startswith('abcdefghijkl'); yu.startswith('abcdefghijkl')", setup=setup)
    # The failing case must use the non-matching prefix in the statement as well.
    runner.timeit(name="multi-character different kind fail: xu.startswith('abcdef_hijkl')", stmt="xu.startswith('abcdef_hijkl'); yu.startswith('abcdef_hijkl')", setup=setup)

By first checking the tail of the substring and then the start, we can combine a call to PyUnicode_READ and memcmp.
For the single-character case this avoids performing the PyUnicode_READ check twice.
With this PR the performance of many cases improves, in particular the cases where the substring matches. The only case that gets slower is a substring that fails to match at its start (while substrings that fail to match at their end get faster).
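The reordering can be modelled in plain Python. This is only a sketch of the order of checks, not the C code in Objects/unicodeobject.c. It assumes both strings have the same kind and that the match starts at offset 0. The function names are hypothetical, indexing stands in for PyUnicode_READ, and slice comparison stands in for memcmp.

```python
def startswith_first_then_last(s: str, sub: str) -> bool:
    """Assumed model of the order in main: first char, last char, whole substring."""
    n = len(sub)
    if n > len(s):
        return False
    if n == 0:
        return True
    # Two single-character reads; for n == 1 both read the same position.
    if s[0] != sub[0] or s[n - 1] != sub[n - 1]:
        return False
    # Stand-in for memcmp over the whole substring.
    return s[:n] == sub


def startswith_tail_first(s: str, sub: str) -> bool:
    """Assumed model of the order in this PR: last char, then the remaining prefix."""
    n = len(sub)
    if n > len(s):
        return False
    if n == 0:
        return True
    # One single-character read on the last position of the substring.
    if s[n - 1] != sub[n - 1]:
        return False
    # Stand-in for memcmp over the remaining n - 1 characters;
    # for n == 1 there is nothing left to compare.
    return s[:n - 1] == sub[:n - 1]
```

In this model a single-character substring needs one read instead of two, and the comparison of the remaining characters never revisits the last position.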
Also see gh-117431: Optimize str.startswith #117480

Issue: Improve performance of startswith, endswith, count, *find, and *index methods for str, bytes and bytearray #117431

erlend-aasland

Quick-and-dirty initial review.

Objects/unicodeobject.c

erlend-aasland · 2024-05-21T16:20:07Z

Do you still see the "two-character fail head" slowdown?

eendebakpt · 2024-05-21T20:58:51Z

> Do you still see the "two-character fail head" slowdown?

Yes, and this is to be expected. For the two-character case we can either

i) Check the first character and then the last (i.e. the second character). This is the current implementation in main.
ii) Check the last character and then the first. This is what this PR does.

The result is that main is faster for "abc".startswith("xb"), but this PR is faster for "abc".startswith("ax"). From an application point of view I am not sure whether one case is more important than the other. In this PR we trade performance between these two cases, but gain performance in several others (e.g. matching a single character, matching multiple characters).
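A small Python sketch makes the trade-off concrete by counting single-character checks for a two-character substring under each order. The helper names are hypothetical, and the count is only a rough proxy for the work done in C.

```python
def first_then_last(s: str, sub: str) -> tuple[bool, int]:
    """Order (i): check sub[0], then sub[1]. Returns (result, checks performed)."""
    checks = 1
    if s[0] != sub[0]:
        return False, checks
    checks += 1
    return s[1] == sub[1], checks


def last_then_first(s: str, sub: str) -> tuple[bool, int]:
    """Order (ii): check sub[1], then sub[0]. Returns (result, checks performed)."""
    checks = 1
    if s[1] != sub[1]:
        return False, checks
    checks += 1
    return s[0] == sub[0], checks


for sub in ("xb", "ax"):
    print(sub, first_then_last("abc", sub), last_then_first("abc", sub))
# xb (False, 1) (False, 2)
# ax (False, 2) (False, 1)
```

Each order exits after one check when the mismatch is at the position it inspects first, and after two checks otherwise.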

I have put some thought into whether we could make this PR more conservative (i.e. no cases with a performance loss, but several cases with a performance gain). The best I have come up with so far is main...eendebakpt:tailmatch_v3. It has some more branches and the memcmp does a bit more work than required, but just like this PR it should improve matching in the single-character and two-character cases.

eendebakpt added 2 commits April 11, 2024 21:40

Improve performance of startswith by eliminating double work in tailm…

035b3e2

…atch

code style

4f4b084

bedevere-app bot added the awaiting review label Apr 11, 2024

eendebakpt changed the title ~~gh-117480: Improve performance of startswith and endswith (version 2)~~ gh-117431: Improve performance of startswith and endswith (version 2) Apr 11, 2024

bedevere-app bot mentioned this pull request Apr 11, 2024

Improve performance of startswith, endswith, count, *find, and *index methods for str, bytes and bytearray #117431

Open

eendebakpt mentioned this pull request Apr 11, 2024

gh-117431: Optimize str.startswith #117480

Closed

📜🤖 Added by blurb_it.

9f201b1

erlend-aasland added the performance Performance or resource usage label Apr 11, 2024

lint

8792d0b

eendebakpt mentioned this pull request Apr 11, 2024

Speed up s.startswith() faster-cpython/ideas#671

Open

Merge branch 'main' into tailmatch_v2

0be010f

erlend-aasland reviewed May 18, 2024

Objects/unicodeobject.c (outdated, resolved)

Objects/unicodeobject.c (resolved)

Objects/unicodeobject.c (outdated, resolved)

eendebakpt and others added 4 commits May 20, 2024 23:04

Update Objects/unicodeobject.c

2a2cfb3

Co-authored-by: Erlend E. Aasland <[email protected]>

update comment

9f8e4b8

Merge branch 'tailmatch_v2' of github.com:eendebakpt/cpython into tai…

642d2bc

…lmatch_v2

Merge branch 'main' into tailmatch_v2

1693405

erlend-aasland reviewed May 21, 2024

Misc/NEWS.d/next/Core and Builtins/2024-04-11-21-17-23.gh-issue-117431.ZxdAFN.rst (outdated, resolved)

erlend-aasland changed the title ~~gh-117431: Improve performance of startswith and endswith (version 2)~~ gh-117431: Improve performance of startswith and endswith May 21, 2024

eendebakpt added 2 commits May 21, 2024 22:12

update news entry

8a7b9fe

Merge branch 'main' into tailmatch_v2

2fff994

erlend-aasland reviewed May 21, 2024

Misc/NEWS.d/next/Core and Builtins/2024-04-11-21-17-23.gh-issue-117431.ZxdAFN.rst (outdated, resolved)

Update Misc/NEWS.d/next/Core and Builtins/2024-04-11-21-17-23.gh-issu…

ed8b9d3

…e-117431.ZxdAFN.rst Co-authored-by: Erlend E. Aasland <[email protected]>

eendebakpt and others added 5 commits July 28, 2024 21:00

Merge branch 'main' into tailmatch_v2

378b586

Merge branch 'main' into tailmatch_v2

6bf1d0c

Merge branch 'main' into tailmatch_v2

8779d23

reduce churn

abe35e8

Merge branch 'main' into tailmatch_v2

febac50
