Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ignore words become case-insensitive if it is only lowercase. #3638

Open
fujitatomoya opened this issue Feb 8, 2025 · 5 comments · May be fixed by #3658
Open

ignore words become case-insensitive if it is only lowercase. #3638

fujitatomoya opened this issue Feb 8, 2025 · 5 comments · May be fixed by #3658
Labels

Comments

@fujitatomoya
Copy link

according to https://github.com/codespell-project/codespell?tab=readme-ov-file#ignoring-words,

When ignoring false positives, note that spelling errors are case-insensitive but words to ignore are case-sensitive.

but the following example tells me that is not implemented when word to ignore is only with lowercase.

tomoyafujita@~/DVT/codespell >cat test.md 
ros2 should not be detected.
ROS2 should be detected.
tomoyafujita@~/DVT/codespell >cat dictionary.txt 
ROS2->ROS 2
tomoyafujita@~/DVT/codespell >codespell -D dictionary.txt,- test.md
test.md:1: ros2 ==> ros 2
test.md:2: ROS2 ==> ROS 2
tomoyafujita@~/DVT/codespell >codespell --ignore-words-list ros2 -D dictionary.txt,- test.md
tomoyafujita@~/DVT/codespell >codespell --ignore-words-list ROS2 -D dictionary.txt,- test.md
test.md:1: ros2 ==> ros 2

i believe this is unexpected behavior, at least it is not consistent with documentation above.

@fujitatomoya
Copy link
Author

#2323 could be related.

@DimitriPapadopoulos
Copy link
Collaborator

DimitriPapadopoulos commented Feb 19, 2025

The relevant code was introduced by #3272:

def process_ignore_words(
words: Iterable[str], ignore_words: Set[str], ignore_words_cased: Set[str]
) -> None:
for word in words:
word = word.strip()
if word == word.lower():
ignore_words.add(word)
else:
ignore_words_cased.add(word)

I seem to recall that the spirit of that change is that lowercase words to ignore are handled in a case-insensitive way (ignore_words), while words with uppercase letters (ignore_words_cased) are handled in a case-sensitive way. Does this make sense to you?

In any case, you're right: at the very least the documentation should have been updated.

@DimitriPapadopoulos
Copy link
Collaborator

While #2323 is similar, I don't think it's directly related.

@fujitatomoya
Copy link
Author

@DimitriPapadopoulos thanks for checking on this, really appreciate that.

I seem to recall that the spirit of that change is that lowercase words to ignore are handled in a case-insensitive way (ignore_words), while words with uppercase letters (ignore_words_cased) are handled in a case-sensitive way. Does this make sense to you?

unfortunately no, it does not work for our use case.
what we want to do is that ignoring the words in a case-sensitive way including lower cases only word. in this particular case, we want to ignore ros2 only, but ROS2, RoS2 or Ros2.

i think this is inconsistent behavior for user perspective.

tomoyafujita@~/DVT/codespell >cat test.md 
ros2 should not be detected.
ROS2 should be detected.
tomoyafujita@~/DVT/codespell >cat dictionary.txt 
ROS2->ROS 2
tomoyafujita@~/DVT/codespell >codespell -D dictionary.txt,- test.md
test.md:1: ros2 ==> ros 2
test.md:2: ROS2 ==> ROS 2
tomoyafujita@~/DVT/codespell >codespell --ignore-words-list ros2 -D dictionary.txt,- test.md
tomoyafujita@~/DVT/codespell >codespell --ignore-words-list ROS2 -D dictionary.txt,- test.md
test.md:1: ros2 ==> ros 2

When ignoring false positives, note that spelling errors are case-insensitive but words to ignore are case-sensitive. makes sense to me, that gives the flexibility for the user to filter out the specific words in a case-sensitive way?

@fujitatomoya fujitatomoya linked a pull request Mar 2, 2025 that will close this issue
@fujitatomoya
Copy link
Author

@DimitriPapadopoulos

could you take a look at #3658?

since this is the 1st time for me to make PR against codespell-project, i would be missing something. but i did went through code and documentation, and i checked local verification is passing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants