A tool for finding spelling errors in Wikipedia caused by missing diacritics. Fixing them automatically is still a TODO.
Install dependencies with Poetry:
$ poetry install
Run the tool:
$ poetry run diacritical
Example output:
The tool prints the word under investigation, then each article with a potential misspelling (title and URL), and finally the number of articles found.
Kererū:
Flower: https://en.wikipedia.org/wiki/Flower
Panaruawhiti / Endeavour Inlet: https://en.wikipedia.org/wiki/Panaruawhiti_%2F_Endeavour_Inlet
2 articles with potential misspellings found.
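The misspelling being searched for is presumably the configured word with its diacritics removed (for example, "Kereru" for "Kererū"). The following is a minimal sketch of how that form could be derived with the standard library. The function name strip_diacritics is hypothetical and is not taken from this project's code.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Return the word with combining marks (macrons, accents, ...) removed."""
    # NFD splits a character such as "ū" into "u" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop every combining mark, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so that any remaining sequences are in canonical form.
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Kererū"))  # prints "Kereru"
```

Words without diacritics pass through unchanged, so the same function can be applied to every configured word.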
Configuration uses TOML files in the config directory. Each word gets its own table:
["Kererū"] # The correct spelling of the misspelled word.
skip = true # Whether to skip processing the word. (optional, default: false)
ignored_patterns = ["Kereru-?Symes", "Count Kereru"] # A list of patterns to ignore. (optional, default: empty list)
ignored_pages = ["Wellington and Manawatu Railway Company"] # A list of pages to ignore. (optional, default: empty list)
Run the linting and testing tools as follows:
$ poetry run black --check .
$ poetry run flake8
$ poetry run coverage run --branch -m unittest
$ poetry run coverage report --fail-under 100