Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elemIndexEnd non-optimally implemented in terms of findIndexEnd #278

Open
archaephyrryx opened this issue Aug 30, 2020 · 1 comment
Open

Comments

@archaephyrryx
Copy link
Contributor

The current implementations of the elemIndexEnd function in modules Data.ByteString and Data.ByteString.Lazy are uniformly defined as findIndexEnd . (==) (the Char8 version merely invokes the strict bytestring definition after converting from Char to Word8).

This seems counterintuitive when compared with elemIndex, which is more optimized than findIndex through the use of the memchr C FFI call to avoid costly byte-by-byte predicate testing.

There exists a GNU extension for string.h that defines an operation memrchr that performs a similar operation to memchr but returns the final occurrence of a byte rather than the first, which could be used when available. Even without such platform-specific optimizations, it should still be possible to either add a memrchr-like function to the cbits code.

Even without FFI calls, elemIndexEnd could use the same logic as findIndexEnd and perform a byte-by-byte direct equality test at least as efficiently as findIndexEnd . (==), but without the indirection.

@Bodigrim
Copy link
Contributor

Bodigrim commented Sep 5, 2020

The implementation of elemIndexEnd was recently discussed here: #155 (comment) It seems that thanks to inlining there is no performance penalty.

AFAIU Windows systems do not provide memrchr, so the only option to remain cross-platform is to implement memrchr in cbits. I do not mind against such approach, if supported by benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants