perfs: optimize serialize_string_value #61

Merged (3 commits) on Jan 15, 2025


MatthieuBizien
Contributor

I found serialize_string_value to be a bottleneck in my code. Hotpatching it with a version that uses a precompiled regular expression (re.sub) instead of a character-by-character "".join fixed the problem.

Benchmark:

import random
import re

# Current implementation: test every character and join the pieces.
def slow_serialize_string_value(value):
    return ''.join(
        r'\"' if c == '"' else
        r'\\' if c == '\\' else
        r'\A ' if c == '\n' else
        r'\D ' if c == '\r' else
        r'\C ' if c == '\f' else
        c
        for c in value
    )

# Map each character that needs escaping to its CSS escape sequence.
_replacement_string_value = {
    '"': r'\"',
    '\\': r'\\',
    '\n': r'\A ',
    '\r': r'\D ',
    '\f': r'\C ',
}
# Character class matching any key above (no ^ or $, so no re.MULTILINE needed).
_re_string_value = re.compile(
    '[' + ''.join(re.escape(e) for e in _replacement_string_value) + ']')

def _serialize_string_value_match(match):
    return _replacement_string_value[match.group(0)]

# Proposed implementation: one regex pass that only calls back on matches.
def fast_serialize_string_value(value):
    return _re_string_value.sub(_serialize_string_value_match, value)

# Random test inputs drawn from code points 0-255.
string = ''.join(chr(random.randint(0, 255)) for _ in range(10_000_000))
small_string = ''.join(chr(random.randint(0, 255)) for _ in range(1000))

%timeit fast_serialize_string_value(string)
# 75.7 ms ± 178 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit slow_serialize_string_value(string)
# 954 ms ± 5.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit slow_serialize_string_value(small_string)
# 97 μs ± 1.22 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit fast_serialize_string_value(small_string)
# 6.8 μs ± 50.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
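The timings only matter if both versions return the same output. The following is a minimal sketch of an equivalence check, not part of the original pull request; the names join_escape, regex_escape and check_equivalence are hypothetical stand-ins for the two implementations benchmarked above.

```python
import random
import re


def join_escape(value):
    """Reference behaviour: escape one character at a time."""
    return ''.join(
        r'\"' if c == '"' else
        r'\\' if c == '\\' else
        r'\A ' if c == '\n' else
        r'\D ' if c == '\r' else
        r'\C ' if c == '\f' else
        c
        for c in value
    )


# Same escape table as in the benchmark, under a hypothetical name.
_ESCAPES = {'"': r'\"', '\\': r'\\', '\n': r'\A ', '\r': r'\D ', '\f': r'\C '}
_ESCAPE_RE = re.compile('[' + ''.join(re.escape(c) for c in _ESCAPES) + ']')


def regex_escape(value):
    """Candidate behaviour: a single regex substitution pass."""
    return _ESCAPE_RE.sub(lambda m: _ESCAPES[m.group(0)], value)


def check_equivalence(trials=200, length=500, seed=0):
    """Compare both functions on seeded random strings (code points 0-255)."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = ''.join(chr(rng.randint(0, 255)) for _ in range(length))
        if join_escape(s) != regex_escape(s):
            return False
    return True


assert check_equivalence()
```

A fixed seed keeps the check reproducible. Random inputs from code points 0-255 include all five escaped characters often enough to exercise every branch, but a few hand-written cases (empty string, a string made only of backslashes) would make a cheap addition.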

@liZe
Member

liZe commented Jan 15, 2025

Thanks a lot! I’ll fix the linting problems (unrelated to your code) and then merge your pull request.

@liZe liZe merged commit b06111d into Kozea:main Jan 15, 2025
0 of 5 checks passed