Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy variants of isInfixOf and breakSubstring #30

Open
ryo1kato opened this issue Sep 13, 2014 · 10 comments
Open

Lazy variants of isInfixOf and breakSubstring #30

ryo1kato opened this issue Sep 13, 2014 · 10 comments

Comments

@ryo1kato
Copy link

isSuffixOf available in Data.ByteString.Lazy, but not in Data.ByteString.Lazy.Char8 (it's commented out). Likewise isInfixOf is not available Data.ByteString.Lazy.

Similarly, there's no lazy version of breakSubstring

Is there any reason for this?

@dcoutts
Copy link
Contributor

dcoutts commented Dec 14, 2014

Exported isSuffixOf.

We lack implementations of isInfixOf and breakSubstring. We should probably get them from the stringsearch package.

@sjakobi sjakobi changed the title isSuffixOf is not exported in Data.ByteString.Lazy.Char8 Lazy variants of isInfixOf and breakSubstring Jul 4, 2020
@Bodigrim
Copy link
Contributor

Bodigrim commented Oct 5, 2020

@subterfugue
Copy link

What's the blocker here? I've just ran into this.

@Bodigrim
Copy link
Contributor

@ribosomerocker No blockers, PRs welcome.

@subterfugue
Copy link

Am I correct in assuming that this depends on #307 ?
This would need to have a breakSubstring implementation used in the lazy substring module. Is it fine to continue with the current implementation?

@Bodigrim
Copy link
Contributor

Not necessarily: it's likely better to have at least some (non-pathological) breakSubstring for lazy ByteString than none at all. Further refinements can be done separately.

@Equwece
Copy link

Equwece commented Aug 2, 2023

Hi, I am trying to add a lazy variant of breakSubstring from stringsearch to solve #100 and need an advice. The code from the stringsearch actively uses (1, 2) array package, but this package is not included in bytestring's build dependencies, what can be the preferred way to solve this problem (I suppose that adding new build dependency is not an option)?

@Bodigrim
Copy link
Contributor

Bodigrim commented Aug 2, 2023

@Equwece you can use https://hackage.haskell.org/package/base-4.18.0.0/docs/Data-Array-Byte.html and low-level operations like https://hackage.haskell.org/package/base-4.18.0.0/docs/GHC-Exts.html#v:readIntArray-35-. You might end up implementing a small module with helper functions similar in spirit to https://hackage.haskell.org/package/text-2.0.2/docs/Data-Text-Array.html.

That said, I'd prototype with array package. If it works out well, you'll be able to migrate to low-level primitives later.

@Bodigrim
Copy link
Contributor

I've been recently hit by the absense of lazy breakSubstring again.

A lazy way to implement breakSubstring for lazy ByteString is to adopt https://github.com/haskell/text/blob/master/src/Data/Text/Internal/Lazy/Search.hs. The algorithm in text has been well tested and well benchmarked, so it's a safe bet, given that it seems no one has much time to dedicate to this issue. @clyring what do you think?

@Equwece
Copy link

Equwece commented Sep 21, 2023

Sorry for the lack of progress on the task, I have implemented a lazy variant of breakSubstring using a self-written BoyerMoore implementation and byte array primitives similar to the text module (really appreciate @Bodigrim answer), but a quick benchmarking showed that my solution is much slower than stringsearch's solution, so I decided not to send pr yet and improve the implementation, but unfortunately I haven't found the time yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants