[Bug]: Forbidden words with hyphens are treated differently #6582

Closed
1 task done
PeterJCLaw opened this issue Nov 25, 2024 · 7 comments · Fixed by #6608

@PeterJCLaw

Kind of Issue

Runtime - command-line tools

Tool or Library

cspell

Version

6.31.1, 8.16.0

Supporting Library

Not sure / None

OS

Linux

OS Version

Ubuntu 22.04.5 LTS

Description

Given the sentence:

Cows don't like flip-flops.

I would expect forbidding both like and flip-flops to behave the same:

!like
!flip-flops

However, this results in only like being picked up as a forbidden word, not flip-flops.

Note that if we change the sentence:

Cows don't like flip-flops very much.

then flip-flops is now rejected.

It's super useful that hyphenated words can be forbidden; however, the presence of the hyphen appears to result in different behaviour when the word is followed by punctuation.

Steps to Reproduce

No response

Expected Behavior

Forbidden hyphenated words should be flagged even when followed by punctuation.

Additional Information

No response

cspell.json

{
  "$schema": "https://raw.githubusercontent.com/streetsidesoftware/cspell/main/cspell.schema.json",
  "version": "0.2",
  "dictionaryDefinitions": [
    {
      "name": "project-words",
      "path": "./.spelling",
      "addWords": true
    }
  ],
  "dictionaries": ["en-gb", "project-words"],
  "useGitignore": true
}

cspell.config.yaml

No response

Example Repository

https://github.com/srobo/website/

Code of Conduct

  • I agree to follow this project's Code of Conduct
@Jason3S
Collaborator

Jason3S commented Nov 26, 2024

@PeterJCLaw,

Thank you. Great example.

@Jason3S
Collaborator

Jason3S commented Nov 27, 2024

@PeterJCLaw,

The cause is that a . can be part of a word. I'll look at fixing it.

The workaround is to add both versions:

!like
!flip-flops
!flip-flops.
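
On affected versions, the extra entries could be generated rather than typed by hand. Here is a minimal Python sketch, assuming the custom dictionary is a plain text file with one entry per line and that only hyphenated forbidden entries need the variant. `add_period_variants` is a hypothetical helper for illustration, not part of cspell.

```python
def add_period_variants(lines):
    """Return the dictionary lines plus a trailing-period variant for each
    forbidden ("!"-prefixed) hyphenated entry that does not already have one."""
    existing = {line.strip() for line in lines}
    result = []
    for line in lines:
        entry = line.strip()
        result.append(entry)
        # Only forbidden, hyphenated entries are affected (assumption).
        if entry.startswith("!") and "-" in entry and not entry.endswith("."):
            variant = entry + "."
            if variant not in existing:
                result.append(variant)
                existing.add(variant)
    return result


print(add_period_variants(["!like", "!flip-flops"]))
# ['!like', '!flip-flops', '!flip-flops.']
```

Running the helper on its own output leaves the list unchanged, so it should be safe to apply repeatedly.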

@PeterJCLaw
Author

You've probably already seen this, but just for the record: if we add !much to ban much, then with the examples above cspell does correctly find and warn about the occurrence of much just before the full stop (.).

For completeness the current behaviour is:

!like
!much
!flip-flops

Cows don't like flip-flops.
           ^^^^
Cows don't like flip-flops very much.
           ^^^^ ^^^^^^^^^^      ^^^^

As you mention, the workaround is to add the trailing punctuation into a variant of the banned word.

@Jason3S
Collaborator

Jason3S commented Nov 27, 2024

@PeterJCLaw,

You just confirmed what I thought was going on.

It has to do with how the spell checker breaks up words. The characters -, _, ., and 0-9 are considered part of words.

In the case of much., it first looks up much. as a whole and cannot find it. It then tries the pieces [much, .]. Since much is forbidden, it is flagged.

In the case of flip-flops, if it sees "flip-flops" it will flag it. But when there is a . at the end, it looks at flip-flops. as a whole. Since that is not in the dictionary, it breaks it up into pieces and looks at each piece: flip, -, flops, and the final . character. Since they are all ok, it doesn't flag anything.

For example, flip-flop-flip should be ok even though it contains flip-flop.

I think it makes sense to strip off the . and check to see if the word is forbidden.

Since much is forbidden, it will pick up munch-much-munch and flag much.
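
The lookup order described above can be modelled with a short sketch. This is a toy model in Python, not cspell's implementation: the names `check_token`, `check_line`, `FORBIDDEN`, and `KNOWN`, the word lists, and the splitting pattern are all assumptions made for illustration. The `strip_trailing_period` flag models the proposed fix of stripping the . before checking whether the word is forbidden.

```python
import re

# Hypothetical word lists standing in for the "!word" entries and the dictionary.
FORBIDDEN = {"like", "much", "flip-flops"}
KNOWN = {"cows", "don't", "very", "flip", "flops", "munch"}


def check_token(token, strip_trailing_period=False):
    """Return the forbidden words flagged within one whitespace-delimited token."""
    lowered = token.lower()
    # Step 1: look the token up as a whole.
    if lowered in FORBIDDEN:
        return [lowered]
    if lowered in KNOWN:
        return []
    # Proposed fix (assumption): retry the whole-word lookup without the trailing period.
    if strip_trailing_period and lowered.endswith("."):
        stripped = lowered.rstrip(".")
        if stripped in FORBIDDEN:
            return [stripped]
    # Step 2: break the token up on '-', '_', '.' and digits, then check each piece.
    pieces = [p for p in re.split(r"[-_.0-9]+", lowered) if p]
    return [p for p in pieces if p in FORBIDDEN]


def check_line(line, **kwargs):
    """Check every token on a line and collect the flagged words in order."""
    flagged = []
    for token in line.split():
        flagged.extend(check_token(token, **kwargs))
    return flagged


print(check_line("Cows don't like flip-flops."))            # ['like']
print(check_line("Cows don't like flip-flops very much."))  # ['like', 'flip-flops', 'much']
print(check_line("Cows don't like flip-flops.", strip_trailing_period=True))
# ['like', 'flip-flops']
```

In this model, munch-much-munch is flagged for much, while flip-flop-flip is not flagged, because flip-flop never appears as a piece on its own.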

@PeterJCLaw
Author

Thanks!

Are there any other punctuation marks which should have similar treatment? Perhaps ,? Though I'm actually unclear why this wouldn't extend to e.g. :, ;, ?, ! (any which wouldn't normally be expected mid-word).

@Jason3S
Collaborator

Jason3S commented Nov 27, 2024

@PeterJCLaw,

These are the special characters allowed, in addition to letters and accents: '’`.+-

Period (.) is allowed because it is part of abbreviations, where a word might only be valid if it is followed by a period.
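
To see why the trailing period ends up inside the token while a comma or question mark does not, here is a rough Python sketch of a tokenizer that treats the characters listed above as word characters. The `WORD_PATTERN` regex and the `tokenize` function are hypothetical and simplified; cspell's real word-splitting rules are more involved.

```python
import re

# Assumption: a word is a run of Unicode letters plus the special characters
# listed above: ' \u2019 (right single quote) ` . + -
# [^\W\d_] matches any letter, including accented ones.
WORD_PATTERN = re.compile(r"(?:[^\W\d_]|['\u2019`.+\-])+")


def tokenize(text):
    """Split text into candidate words under the simplified rule above."""
    return WORD_PATTERN.findall(text)


print(tokenize("Cows don't like flip-flops."))
# ['Cows', "don't", 'like', 'flip-flops.']
print(tokenize("e.g. flip-flops, maybe?"))
# ['e.g.', 'flip-flops', 'maybe']
```

Under this rule the period stays attached to flip-flops, so the whole-word lookup sees flip-flops. rather than flip-flops, whereas a following comma or question mark is dropped before lookup.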

Contributor

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2024