-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core(seo): support spanish in link-text audit #7547
Conversation
Words included: - click here - here - more - more information - this - link - this link - start
There was a little space in the last line which I had deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for the contribution @MarkelFe!
Core team: We sortof accepted the Japanese contribution without a lot of process, but if we're going to be accepting more languages we might want a bit more consideration.
Maybe we say just top 10 languages on the internet welcome or something?
Maybe we want at least 2 native speakers to agree the sets are reasonable and not too wide?
Just thinking out loud.
@@ -25,6 +25,25 @@ const BLOCKLIST = new Set([ | |||
'続きを読む', | |||
'続く', | |||
'全文表示', | |||
// Spanish | |||
'click aquí', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a fair amount of duplication with the permutations, I wonder if we could do some replacements to keep the signal in here high?
could be as simple as an easy map for now
const CHARACTER_REPLACEMENTS = new Map([
[/í/g, 'i'],
[/á/g, 'a'],
[/ó/g, 'o'],
])
function normalizeText(text) {
let normalized = text.trim().toLowerCase();
for (const [regex, replacement] of CHARACTER_REPLACEMENTS.entries()) {
normalized = normalized.replace(regex, replacement)
}
return normalized;
}
...
return BLOCKLIST.has(normalizeText(link.text));
DISCLAIMER: untested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could do some replacements to keep the signal in here high?
ha, the problem is with this change I would wonder if normalizeText
is quite correct or if it's missing something or adding things I'm not expecting, versus just scanning a medium sized list :)
But I'd be ok with tests for it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Core team: We sortof accepted the Japanese contribution without a lot of process, but if we're going to be accepting more languages we might want a bit more consideration.
yeah, considering we barely had a process for the original list of english terms, I really don't know what the best approach is here
Maybe we want at least 2 native speakers to agree the sets are reasonable and not too wide?
seems like a good idea to me
@@ -25,6 +25,25 @@ const BLOCKLIST = new Set([ | |||
'続きを読む', | |||
'続く', | |||
'全文表示', | |||
// Spanish | |||
'click aquí', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could do some replacements to keep the signal in here high?
ha, the problem is with this change I would wonder if normalizeText
is quite correct or if it's missing something or adding things I'm not expecting, versus just scanning a medium sized list :)
But I'd be ok with tests for it 👍
I have added a couple of translations for the Links Do Not Have Descriptive Text audit in Spanish.
Note that in Spanish there is this symbol:
´
. People forget to write it so we may take into account these errors, let me know your opinion.Related Issues/PRs
#5322 but in Spanish