Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-122288: Improve performances of fnmatch.translate #122289

Merged
merged 18 commits into from
Nov 27, 2024

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Jul 25, 2024

This is a smaller PR compared to the one for the C implementation and is probably easier to review. Below are the benchmarks reported on the issue:

+------------------------------------------+-----------------------+-----------------------+
| Benchmark                                | fnmatch-translate-ref | fnmatch-translate-py  |
+==========================================+=======================+=======================+
| abc/[!]b-ac-z9-1]/def/\*?/*/**c/?*[][!]  | 6.09 us               | 3.99 us: 1.53x faster |
+------------------------------------------+-----------------------+-----------------------+
| !abc/[!]b-ac-z9-1]/def/\*?/*/**c/?*[][!] | 6.39 us               | 4.07 us: 1.57x faster |
+------------------------------------------+-----------------------+-----------------------+
| a**?**cd**?**??k***                      | 2.24 us               | 1.51 us: 1.49x faster |
+------------------------------------------+-----------------------+-----------------------+
| a/**/b/**/c                              | 1.97 us               | 1.12 us: 1.76x faster |
+------------------------------------------+-----------------------+-----------------------+
| man/man1/bash.1                          | 3.00 us               | 1.21 us: 2.48x faster |
+------------------------------------------+-----------------------+-----------------------+
| a*b*c*d*e*f*g*h*i*j*k*l*m*n*o**          | 5.40 us               | 3.33 us: 1.62x faster |
+------------------------------------------+-----------------------+-----------------------+
| Geometric mean                           | (ref)                 | 1.71x faster          |
+------------------------------------------+-----------------------+-----------------------+

@picnixz picnixz added performance Performance or resource usage stdlib Python modules in the Lib dir labels Jul 25, 2024
@barneygale barneygale self-requested a review July 27, 2024 16:12
@picnixz
Copy link
Member Author

picnixz commented Aug 14, 2024

@barneygale Friendly ping in case you forgot you were interested in this PR.

Lib/fnmatch.py Outdated Show resolved Hide resolved
@barneygale
Copy link
Contributor

Could you add some timings to the PR description please?

Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement :)

Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/test/test_fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Show resolved Hide resolved
@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

Small optimization possibility - 1% for 1 pattern. Not sure about the others. Maybe slightly more readable too.

REP1 = repeat('\\')
REP2 = repeat(r'\\')
REP3 = repeat('-')
REP4 = repeat(r'\-')
_replace = str.replace

...

stuff = map(_replace, chunks, REP1, REP2)
stuff = map(_replace, stuff, REP3, REP4)
stuff = '-'.join(stuff)

And cache size. Not sure what it should be, are any rules for such?

Apart from this, LGTM.

@picnixz
Copy link
Member Author

picnixz commented Aug 23, 2024

And cache size. Not sure what it should be, are any rules for such?

I kept the same cache size. For re.escape we could have a smaller cache though if we are assuming US glyphs. We could have a cache of 512 to handle latin-1 + special characters from foreign languages (next power after 256). What do you think? I put 32k but it's probably an overkill and we could probably be safe under 4096 different glyphs, even in foreign languages such as Chinese characters.

@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

My thinking is as follows.

With 32K size, the max size of this application (with assumption that average string size is 20 characters) is:

In [39]: sys.getsizeof('a' * 20) * 2 * 32000 / 1024**2
Out[39]: 3.7   # MB

It is not much for standard machine, but is Python used on some micro platforms where that might be big? I have little experience with such and 4MB might not be an issue at all, but I would say it is worth looking into it.

I think the easiest would be to find other use cases of lru_cache in standard library and see what is maximum size limit for those. If those exist, then this could be answered easily without much effort.

Otherwise, if you can't find anything maybe someone else has any insights?

As a last resort, could make a conservative choice. My python starts with 5MB initial memory consumption. Take 5% of it, then, 2048 size would be that.

@picnixz
Copy link
Member Author

picnixz commented Aug 23, 2024

I think the easiest would be to find other use cases of lru_cache in standard library and see what is maximum size limit for those. If those exist, then this could be answered easily without much effort.

fnmatch._compile_pattern takes 32k as a cache size. It is also documented as such. For re.escape, the cache could be much smaller (see my comment on the glyphs) so I think taking 2048/4096 would be enough.

but is Python used on some micro platforms where that might be big?

For microcontrollers, there is MicroPython and only a subset of Python is actually available. 4MB can be quite big but I don't think they would implement LRU cache that size.

@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

fnmatch._compile_pattern takes 32k as a cache size.

In this case, 32K seems fine to me. Just follow the standard and if there are any issues in the future, these can be handled as a pair.

@picnixz
Copy link
Member Author

picnixz commented Aug 27, 2024

@barneygale This one is the (only) remaining fnmatch PR that I decided to keep. I'd appreciate your opinion on the size of the cache for re.escape (I think we could live with maxsize=4096 instead of 32k).

Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
@picnixz picnixz force-pushed the fnmatch-py-implementation branch from a205e6b to 0518912 Compare August 28, 2024 09:44
The rationale for this change is as follows:

re.escape() is only used to cache single Unicode characters
in shell patterns; we may heuristically assume that they are
ISO-8859-1 encodable, thereby requiring a cache of size 256.
To allow non-traditional glyphs (or alphabets with a small
number of common glyphs), we double the cache size.
@picnixz picnixz force-pushed the fnmatch-py-implementation branch from 0518912 to bb6c3ee Compare August 28, 2024 10:08
@picnixz picnixz requested a review from barneygale October 3, 2024 18:58
@picnixz
Copy link
Member Author

picnixz commented Oct 14, 2024

@barneygale friendly ping in case you forgot about this PR!

Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work :)

Lib/fnmatch.py Outdated Show resolved Hide resolved
Co-authored-by: Barney Gale <[email protected]>
@picnixz
Copy link
Member Author

picnixz commented Oct 18, 2024

I'm committing from my phone so hopefully it'll be fine. Otherwise I'll have a look at my return next week!

Lib/test/test_fnmatch.py Outdated Show resolved Hide resolved
@picnixz
Copy link
Member Author

picnixz commented Oct 22, 2024

@barneygale I've also updated the tests just to remember what the indices were so it should be ready to merge now (you can take a last look since I've stared waaaay to much at this PR).

@picnixz
Copy link
Member Author

picnixz commented Nov 27, 2024

Friendly ping for this PR @barneygale Maybe you want to merge it soon?

@barneygale
Copy link
Contributor

Very sorry @picnixz, I've been holding off because I was hoping to get #116392 in first, which would indirectly increase the test coverage of fnmatch.translate(). I'll take another look today.

@picnixz
Copy link
Member Author

picnixz commented Nov 27, 2024

Oh I see! no worries (now I know why you didn't automerge it).

@barneygale barneygale merged commit 78cb377 into python:main Nov 27, 2024
36 checks passed
@picnixz picnixz deleted the fnmatch-py-implementation branch November 27, 2024 16:46
@barneygale
Copy link
Contributor

Sorry again for taking ages, thanks for the improvement :)

ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
…122289)

Improve performance of this function by a factor of 1.7x.

Co-authored-by: Barney Gale <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants