Ignore spell check in hyperlinks #24

Closed
MarekLani opened this issue Oct 16, 2020 · 7 comments
@MarekLani

Is there a way to exclude hyperlinks from the spell check?
link text

@jonasbn
Collaborator

jonasbn commented Oct 18, 2020

Do you mean that you want to check the spelling of the "link text", but not the actual URL?

@jonasbn jonasbn self-assigned this Oct 18, 2020
@jonasbn jonasbn added the question Further information is requested label Oct 18, 2020
@jonasbn
Collaborator

jonasbn commented Oct 18, 2020

Hi @MarekLani

I ran a test, invoking pyspelling manually on an HTML file. It does not report any spelling errors for the HTML markup, only for the text content.

$ clear; pyspelling --config spellcheck.yaml
Misspelled words:
<htmlcontent> index.html: html>body
--------------------------------------------------------------------------------
baader
evin
speeling
--------------------------------------------------------------------------------

!!!Spelling check failed!!!

HTML file:

<a href="/baad_speling/">evin baader speeling</a>

Using the configuration outlined here:

matrix:
- name: HTML
  aspell:
    lang: en
  dictionary:
    encoding: utf-8
  pipeline:
  - pyspelling.filters.html:
      comments: false
  sources:
  - '**/*.html'
  default_encoding: utf-8

Could you please provide me with a copy of your configuration file?

@MarekLani
Author

@jonasbn thank you for the response, and sorry, I should have stated that this occurs in the Markdown check. I meant the following:
[Link text to be checked](link_address_should_not_be_checked)

This is my config:

matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - wordlist.txt
    encoding: utf-8
  pipeline:
    - pyspelling.filters.markdown:
        markdown_extensions:
        - markdown.extensions.extra:
    - pyspelling.filters.html:
        comments: true
        attributes:
        - title
        - alt
        ignores:
        - ':matches(code, pre)'
        - 'code'
        - 'pre'
  sources:
  - '**/*.md'
  default_encoding: utf-8

@jonasbn
Collaborator

jonasbn commented Oct 19, 2020

Hi @MarekLani

I have tested against this example file using your provided config:

[evin baader speeling](/baad_speling/)

$ pyspelling --config spellcheck2.yaml
Misspelled words:
<htmlcontent> index.md: html>body>p
--------------------------------------------------------------------------------
baader
evin
speeling
--------------------------------------------------------------------------------

!!!Spelling check failed!!!

The link text is checked, not the URL part.

Could you provide more detail on what you observe? I cannot reproduce the behaviour you describe.

Do note that pyspelling converts Markdown to HTML before running the check, which is why the output refers to a position in the HTML DOM.

From the example above

index.md: html>body>p

@jonasbn
Collaborator

jonasbn commented Oct 24, 2020

Hi @MarekLani

I have responded to your question and demonstrated the use of the software and its expected behaviour, so I am closing this issue.

@jonasbn jonasbn closed this as completed Oct 24, 2020
@rishitc

rishitc commented Jul 8, 2021

I'm not sure whether the issue still exists for the original author, but I ran into this very issue some time ago. The configuration file discussed here (quoted below), where the HTML filter (with some options) runs after the Markdown filter, completely solved it for me 👍

@jonasbn thank you for the response, and sorry, I should have stated that this occurs in the Markdown check. I meant the following:
[Link text to be checked](link_address_should_not_be_checked)

This is my config:

matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - wordlist.txt
    encoding: utf-8
  pipeline:
    - pyspelling.filters.markdown:
        markdown_extensions:
        - markdown.extensions.extra:
    - pyspelling.filters.html:
        comments: true
        attributes:
        - title
        - alt
        ignores:
        - ':matches(code, pre)'
        - 'code'
        - 'pre'
  sources:
  - '**/*.md'
  default_encoding: utf-8

@facelessuser

Yes, to avoid <a> tags, you would use the HTML filter. The Markdown filter is only used to convert Markdown to HTML. If you need to avoid URLs in plain text, the URL filter can help with that.
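
As a rough illustration of the advice above, a pipeline combining the three filters might look like the sketch below. It is an assumption-laden example, not a configuration posted in this thread: the `aspell`, `dictionary`, and `sources` settings are copied from the earlier configs, placing `pyspelling.filters.url` last is one plausible ordering, and adding `a` to `ignores` is optional (it skips the visible link text as well, which may not be what you want).

```yaml
matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    encoding: utf-8
  pipeline:
  # Convert Markdown to HTML; link targets end up in href attributes.
  - pyspelling.filters.markdown:
  # Extract text from the HTML. Attribute values are only checked
  # when they are listed under `attributes`, so href is skipped here.
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
      # Optional (assumption): also skip the link text inside <a> tags.
      - a
  # Strip bare URLs and email addresses left in the running text.
  - pyspelling.filters.url:
  sources:
  - '**/*.md'
  default_encoding: utf-8
```

Note that the options for each filter are indented one level deeper than the filter name, so that they are read as that filter's settings.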
