Add option to skip dictionary compilation #196

Open
joapuiib opened this issue Nov 8, 2024 · 7 comments
Labels
S: triage Issue needs triage.

Comments

@joapuiib

joapuiib commented Nov 8, 2024

I've found that compiling the dictionary is the heaviest task in my setup, and I'd like to be able to skip it when the dictionary is already compiled and the custom dictionary has not changed.

I can check these conditions in a bash script, but I need an option in pyspelling to skip this process.

gir-bot added the "S: triage" (Issue needs triage) label Nov 8, 2024
@facelessuser
Owner

Thanks, I'll look into this once I have time. Laptop just died, so it may be a bit.

@facelessuser
Owner

Had to get a new laptop. I think I'm sort of functional again. So, I assume you are talking about the compiling of your custom word lists.

I imagine we can just hash the files and write some dictionary cache file. We'll still have to read the files to hash them, but maybe that will end up being faster than also compiling them.
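A minimal sketch of what such a hash-plus-cache check could look like, not pyspelling's actual implementation. The function names (`digest_files`, `needs_compile`, `record_compile`) and the idea of storing the digest in a small side file are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def digest_files(paths):
    """Return one SHA-256 hex digest covering the names and contents of all files."""
    h = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the path so renaming a word list also changes the digest.
        h.update(str(path).encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def needs_compile(wordlists, output, cache_file):
    """Return True if the output is missing or the word lists changed since the last recorded compile."""
    output, cache_file = Path(output), Path(cache_file)
    if not output.exists() or not cache_file.exists():
        return True
    cached = cache_file.read_text(encoding="utf-8").strip()
    return cached != digest_files(wordlists)


def record_compile(wordlists, cache_file):
    """Store the current digest (hypothetical cache format: a single hex string)."""
    Path(cache_file).write_text(digest_files(wordlists), encoding="utf-8")
```

The caller would run `needs_compile(...)` before compiling and `record_compile(...)` after a successful compile. Every word list is still read in full on each run, so the saving is only the compile step itself.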

@joapuiib
Author

joapuiib commented Nov 11, 2024

Yes, I'm talking about compiling custom word lists.

I was thinking of a more basic approach with a toggle argument in the CLI tool, like --skip-dictionary-compilation.

Then I can decide in a bash script whether to skip this process. That check could be simple or complex (some sort of cache, or just looking at the modification timestamp), but pyspelling wouldn't have to care about that.
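As a rough illustration of the external check described above, here is a sketch of a wrapper that compares modification timestamps and only then adds the flag. It is written in Python rather than bash. The helper names and the default config path are assumptions, and `--skip-dictionary-compilation` is the flag proposed in this issue, not an existing pyspelling option as far as this thread shows.

```python
from pathlib import Path


def is_up_to_date(output, sources):
    """Return True if output exists and is at least as new as every source file."""
    output = Path(output)
    if not output.exists():
        return False
    out_mtime = output.stat().st_mtime
    return all(Path(s).stat().st_mtime <= out_mtime for s in sources)


def build_command(output, sources, config=".pyspelling.yml"):
    """Build the pyspelling command line, adding the proposed flag when nothing changed."""
    cmd = ["pyspelling", "-c", config]
    if is_up_to_date(output, sources):
        # Hypothetical flag from this issue; it is a proposal, not a shipped option.
        cmd.append("--skip-dictionary-compilation")
    return cmd
```

The resulting list could be passed to `subprocess.run`. The comparison relies on file timestamps being trustworthy on the system in question.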

@facelessuser
Owner

If I recall, you are using Hunspell?

I think Hunspell doesn't actually compile a dictionary so much as splice the various dictionaries together and copy the result to the output location; only Aspell does an actual compile (assuming I remember correctly).

If I add a skip method, I need to at least check if the output dictionary exists. If it exists and skip is enabled, we could ignore compilation. Let me think about this. If we go this route, I imagine it would be pretty easy to implement.
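The "exists and skip is enabled" rule could be sketched roughly as follows. This is only an illustration under assumed names: `compile_dictionary`, its `skip` parameter, and the `compile_fn` callback are hypothetical and do not reflect pyspelling's internals.

```python
from pathlib import Path


def compile_dictionary(output, compile_fn, skip=False):
    """Run compile_fn(output) unless skipping was requested and output already exists.

    Returns True if compilation ran, False if it was skipped.
    """
    if skip and Path(output).exists():
        return False
    # Either skip was not requested or there is nothing to reuse, so compile.
    compile_fn(output)
    return True
```

With this shape, requesting a skip when no output dictionary exists still compiles, so a first run cannot end up without a dictionary.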

As a side note, modified timestamps are not always reliable, depending on the system.

@joapuiib
Author

Yes, I'm using Hunspell.

> If I add a skip method, I need to at least check if the output dictionary exists. If it exists and skip is enabled, we could ignore compilation. Let me think about this. If we go this route, I imagine it would be pretty easy to implement.

Sure, that is a reasonable check.

@joapuiib
Author

joapuiib commented Nov 11, 2024

> I imagine we can just hash the files and write some dictionary cache file. We'll still have to read the files to hash them, but maybe that will end up being faster than also compiling them.

I also think that hashing the files would be a nice solution, but I don't know how complex this check would be or whether the speed-up would justify it.

@facelessuser
Owner

> I also think that hashing the files would be a nice solution, but I don't know how complex this check would be and if a justified speed-up is achieved with this approach.

I don't think it would provide a speedup in certain cases, but it is the more reliable approach. The best way is probably just to check whether the output dictionary exists and skip compilation if requested.
