[4.1] Language of Parts [a11y] #35607

Merged
merged 13 commits into from
Nov 21, 2021


brianteeman
Contributor

@brianteeman brianteeman commented Sep 19, 2021

This PR is an implementation of the new "language of parts" feature in tinyMCE

It has been made to 4.1 as it's a new feature

This PR requires #35605 which at the time of this PR has not been merged into 4.1dev

This is a replacement PR of #30939

This wraps the desired text in span tags with a lang attribute for the specified language. Unspecified text is assumed to be written in the page's language. This helps the resulting text comply with WCAG 2.0 3.1.2 Language of Parts: "The human language of each passage or phrase in the content can be programmatically determined..." See https://www.w3.org/WAI/WCAG21/Techniques/html/H58.html
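To make the resulting markup concrete, here is a minimal sketch of the wrapping described above. The helper names `escapeHtml` and `wrapInLang` are hypothetical and exist only for illustration; inside the editor TinyMCE does the wrapping itself, so this is not the code from this PR.

```typescript
// Hypothetical helpers for illustration only; not part of TinyMCE or Joomla.

// Escape characters that are unsafe inside HTML text or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a phrase in a span carrying the lang attribute for that phrase.
function wrapInLang(text: string, lang: string): string {
  return `<span lang="${escapeHtml(lang)}">${escapeHtml(text)}</span>`;
}

// Text outside the span is assumed to be in the page's own language.
const paragraph =
  "<p>As the French say: " +
  wrapInLang("honni soit qui mal y pense", "fr") +
  "</p>";

console.log(paragraph);
```

Only the marked phrase gets a lang attribute; everything else inherits the language declared on the page.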

If you have customised your tinymce editor toolbar you will need to edit the toolbar again to include this button.

The need for this is based on the EU funded research project for improving the process of creating accessible content by authors https://accessibilitycluster.com/about

You can view a video that demonstrates the benefits of this feature https://www.youtube.com/watch?v=BY9_xhjtLV4 and read the Technical Specification that the research project produced.

The list of languages to be included is a user defined subform in the plugin.
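As a rough sketch of how a user-defined list could be handed to the editor: as far as I understand it, TinyMCE reads the available languages from its `content_langs` option (entries with `title` and `code`) and exposes them through the `language` toolbar button. The row shape `LanguageRow` and the function `toContentLangs` below are assumptions for illustration, not the field names or code used in this PR.

```typescript
// Assumed shape of one row of the subform; the real field names may differ.
interface LanguageRow {
  name: string;
  tag: string;
}

// Entry shape for TinyMCE's content_langs option.
interface ContentLang {
  title: string;
  code: string;
}

// Convert subform rows into content_langs entries, skipping empty rows
// and duplicate tags (compared case-insensitively).
function toContentLangs(rows: LanguageRow[]): ContentLang[] {
  const seen = new Set<string>();
  const result: ContentLang[] = [];
  for (const row of rows) {
    const title = row.name.trim();
    const code = row.tag.trim();
    const key = code.toLowerCase();
    if (title === "" || code === "" || seen.has(key)) {
      continue;
    }
    seen.add(key);
    result.push({ title, code });
  }
  return result;
}

// Options object that would be passed to the editor's init call.
const editorOptions = {
  toolbar: "bold italic language",
  content_langs: toContentLangs([
    { name: "French", tag: "fr-FR" },
    { name: "German", tag: "de-DE" },
  ]),
};

console.log(JSON.stringify(editorOptions.content_langs));
```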

In addition, the research project recommended that there should be some form of visible indicator to the content author that a piece of text has been marked as being in a specific language. TinyMCE have not implemented this (yet), so I have done it with some CSS for the editor.
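For anyone wondering what such an indicator could look like, here is one possible approach, sketched as a string that could be passed to TinyMCE's `content_style` option. The selectors and styling are my own assumptions and not necessarily the CSS shipped in this PR; `langIndicatorCss` is a hypothetical helper name.

```typescript
// Build editor-only CSS that outlines marked text and shows its lang value.
// This styling is illustrative; it is not the CSS from the PR.
function langIndicatorCss(outlineColor: string = "#999"): string {
  return [
    `span[lang] { outline: 1px dashed ${outlineColor}; }`,
    `span[lang]::after { content: " [" attr(lang) "]"; font-size: 0.75em; vertical-align: super; }`,
  ].join("\n");
}

// Would be merged into the editor options, so the indicator only appears
// while editing and never in the saved content.
const indicatorOptions = {
  content_style: langIndicatorCss("#c00"),
};

console.log(indicatorOptions.content_style);
```

Because the rules live in the editor's stylesheet rather than in the content, the site frontend is unaffected.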

cc @chmst @bembelimen


This PR is a draft to implement and showcase the new language of parts feature in tinyMCE

It has been made to 4.1 as it's a new feature

This PR requires joomla#35605 which at the time of this PR has not been merged into 4.1dev

This is a replacement PR of joomla#30939

This wraps the desired text in span tags with a lang attribute for the specified language. Unspecified text is assumed to be written in the page's language. This helps the resulting text comply with WCAG 2.0 3.1.2 Language of Parts: "The human language of each passage or phrase in the content can be programmatically determined..."

If you have customised your tinymce editor toolbar you will need to edit the toolbar again to include this button.

The need for this is based on the EU funded research project for improving the process of creating accessible content by authors https://accessibilitycluster.com/about

You can view a video that demonstrates the benefits of this feature https://www.youtube.com/watch?v=BY9_xhjtLV4 and read the [Technical Specification](https://www.dropbox.com/s/mbzh30rdt0c0gqa/Technical%20specification%20-%20Change%20language%20%28We4Authors%20Cluster%29.pdf) that the research project produced.

This PR is only a draft

The remaining task is to decide
- which languages to list
- if they should be translatable
- should they be fr or fr-FR
- or should the list of languages be user selectable in the tinymce plugin configuration

In addition the research project recommended that there should be some form of visible indicator to the content author that a piece of text has been marked as being in a specific language. TinyMCE have not implemented this (yet) but I think we could do it with some css?

cc @chmst
@bembelimen
Contributor

bembelimen commented Sep 22, 2021

Thanks for this PR, very useful.

The remaining task is to decide

* which languages to list

"All", but preselected with, let's say, the 10 most spoken ones?

* if they should be translatable

Yes I would think so.

* should they be fr or fr-FR

Joomla! is using the latter for the html tag, so we should stick with it.
Side comment: while reading this, it clearly recommends en-GB when using two sections, but Joomla! is shipping en-gb in e.g. Cassiopeia, so it is probably worth changing in another PR...

* or should the list of languages be user selectable in the tinymce plugin configuration

If possible we should ship a list of X languages (see first point) and user can add more, as a repeatable subform-field in the tinymce plugin parameter.

In addition the research project recommended that there should be some form of visible indicator to the content author that a piece of text has been marked as being in a specific language. TinyMCE have not implemented this (yet) but I think we could do it with some css?

Yes, the question is, what should be displayed?
The tag like https://jsfiddle.net/a72gkqxe/ or an icon?

@bembelimen bembelimen added the a11y Accessibility label Sep 22, 2021
@brianteeman
Contributor Author

the ten most spoken is probably not going to work as they won't be the top ten languages likely to be used on a web site.

For example this is a top ten list of spoken languages
https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/

As you can see, French and German do not appear on the list.

This is a top ten list of languages on the internet
https://speakt.com/top-10-languages-used-internet/

French and German do appear, but so do other languages that might appear "strange" in a list.

I think I will try to create it so that it's a repeatable subform-field with some prefilled languages (those using Latin character sets)

Yes, the question is, what should be displayed?

This was the part of the research project that we had no recommendation for, which is perhaps why tinymce didn't implement anything. I will try what you propose.

regarding the capitalisation you highlighted. I am reviewing the documentation from the w3c i18n working group for clarification https://www.w3.org/International/

@brianteeman
Contributor Author

regarding the capitalisation.
I don't know why, but since the beginning of the system language plugin ten years ago the region subtags have been converted to lowercase.

which was probably following the code here

/**
 * Sets the global document language declaration. Default is English (en-gb).
 *
 * @param   string  $lang  The language to be set
 *
 * @return  Document instance of $this to allow chaining
 *
 * @since   1.7.0
 */
public function setLanguage($lang = 'en-gb')
{
    $this->language = strtolower($lang);

    return $this;
}

I spoke to the head of the w3c i18n working group who pointed me to https://www.w3.org/International/questions/qa-choosing-language-tags

Think about letter-case. By convention, primary language subtags are lowercase, script subtags begin with an uppercase letter, and continue with lowercase, and region subtags are uppercase. This is only a convention, however, and you are free to use whatever letter-casing you like.
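To illustrate the convention quoted above, here is a simplified sketch that applies the conventional casing to a tag. `normalizeTagCase` is a hypothetical helper, not anything in Joomla or TinyMCE, and it deliberately ignores extension and private-use subtags, so treat it as a demonstration of the rule rather than a full BCP 47 implementation.

```typescript
// Apply conventional casing: primary language lowercase, four-letter script
// subtags in title case, two-letter region subtags uppercase. Everything
// else (numeric regions, variants) is lowercased. Simplified: extension and
// private-use subtags are not treated specially.
function normalizeTagCase(tag: string): string {
  return tag
    .split("-")
    .map((subtag, index) => {
      if (index === 0) {
        return subtag.toLowerCase();
      }
      if (/^[A-Za-z]{4}$/.test(subtag)) {
        return subtag.charAt(0).toUpperCase() + subtag.slice(1).toLowerCase();
      }
      if (/^[A-Za-z]{2}$/.test(subtag)) {
        return subtag.toUpperCase();
      }
      return subtag.toLowerCase();
    })
    .join("-");
}

console.log(normalizeTagCase("en-gb"));
console.log(normalizeTagCase("ZH-hant-tw"));
```

Since the casing is only a convention, both en-gb and en-GB identify the same language; the sketch just shows what the conventional form looks like.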

@bembelimen bembelimen added this to the Joomla 4.1 milestone Sep 22, 2021
@chmst
Contributor

chmst commented Sep 22, 2021

Very good improvement. Highlighting language parts surely is possible via css, at least with js - maybe by adding the respective flag?

which languages to list

I think the languages should be configurable in tinyMCE, at least a few of them. English of course is always needed, but beyond that it depends. In Europe / America your proposed languages seem to be ok. But in Asian countries, other languages could be more important. Opinions from native speakers?

Unspecified text is assumed to be written in the page's language.

You mean the frontend language?

@brianteeman
Contributor Author

Unspecified text is assumed to be written in the page's language.

You mean the frontend language?

The language of the page as defined in the lang attribute of the <html> element.

@infograf768
Member

infograf768 commented Sep 23, 2021

Aren't the language possibilities defined by the screenreader itself?

Example with MacOS Voice Over:

Screen Shot 2021-09-23 at 08 20 11

Remark that evidently Chinese is different for Taiwan and China (zh-TW and zh-CN in Joomla, also for Hong Kong, not sh btw).

I found a list of supported languages in most screenreaders and tags to use.
https://accessibility.psu.edu/foreignlanguages/langtaghtml/
IMHO, we should list all of these in this PR.

@brianteeman
Contributor Author

No that is not the way to do it. That assumes that the content author knows about or even has a screen reader. "Language of parts" is about much more than just a screen reader or selecting a voice. There is also no connection between "language of parts" and any joomla language packs. Finally to be usable there is no point in having a huge list of languages available if the content author is only going to need a select few.

@infograf768
Member

Wonder why I even try again to help...
Any clever people would have understood that I never said there was a connection between language of parts and our packs. Just explaining that one cannot use the anyway wrong sh for Chinese but has to use the full tags of the 3 common variations instead.
If the Content author has no idea about the feature, she/he would not know which tags to use. This has to be set by the site administrator.
If the site admin knows about the feature and can add these tags as parameters in Tiny, all is fine.
Hardcoding a limited list of languages is useless.

@brianteeman
Contributor Author

I guess you never actually read the original post. Oh well. At least others did and commented appropriately which is greatly appreciated.

@joomla-cms-bot joomla-cms-bot added the Language Change This is for Translators label Sep 26, 2021
@brianteeman brianteeman marked this pull request as ready for review September 26, 2021 14:16
@brianteeman
Contributor Author

This is now ready for testing etc

@wojsmol
Contributor

wojsmol commented Sep 26, 2021

@brianteeman There is a merge conflict in package-lock.json.

@chmst
Contributor

chmst commented Oct 24, 2021

A very good and appreciated feature. In my test (using patchtester) I had an effect where I am not sure if it is the new function or a general tinyMCE issue.

If I mark several parts, they are not marked correctly.

I had the following text:
this is a french text: honni soit qui mal y pense and it's pendant in German: Ein Schelm, wer Böses dabei denkt 

image

Marked
"honni soit qui mal y pense" as French
"pendant" as French
"Ein Schelm, wer Böses dabei denkt" as German

the result is
<p>this is a french text: honni soit qui mal y pense<span lang="fr"> and it's </span>pendant in German: Ein Schelm, wer Böses dabei denkt </p>

it should be

<p>this is a french text: <span lang="fr">honni soit qui mal y pense</span>and it's <span lang="fr">pendant </span> in German: <span lang="de">Ein Schelm, wer Böses dabei denkt </span></p>

@brianteeman
Contributor Author

Could you try to upload your video again? I suspect it's a tinymce upstream issue, but I need to see it.

@chmst
Contributor

chmst commented Oct 24, 2021

Problem with the first video, so I repeated the whole. The result is different.

language-parts

@brianteeman
Contributor Author

The advice from tinymce is to apply the span after you have entered the text

@chmst
Contributor

chmst commented Oct 24, 2021

I have tested this item ✅ successfully on 75a0575

The test is successful if the parts are marked after the text has been written.
Maybe an extra hint could be added to the documentation.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/35607.

@richard67
Member

This PR requires #35605 which at the time of this PR has not been merged into 4.1dev

I think that part of the description is not right anymore.

@richard67
Member

richard67 commented Nov 20, 2021

I have tested this item ✅ successfully on 75a0575

It works like I expected. I select the content and mark it with the desired language. It uses a span to wrap the text, which is quite valid because it is an inline element, and that's what w3c recommended as far as I could read in another context (inline bidirectional markup).

It can be even nested, i.e. I can mark a text as English and inside that mark a German part as German.

The result is <p><span lang="en-GB">This is a test. <span lang="de-DE">Dies ist ein Test.</span> This is a Test.</span></p>

That's quite valid (but maybe not a very useful example).
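To spell out why the nested result is valid: the lang attribute is inherited, so each piece of text takes the language of its nearest ancestor that declares one. The sketch below models that rule on a tiny hand-built tree; the types `LangNode` and `Run` and the function `collectRuns` are hypothetical names for illustration, and no real DOM or TinyMCE API is involved.

```typescript
// Minimal tree model: a node may declare a lang, carry text, and have children.
interface LangNode {
  lang?: string;
  text?: string;
  children?: LangNode[];
}

// A stretch of text together with the language that applies to it.
interface Run {
  text: string;
  lang: string;
}

// Walk the tree; a node's own lang overrides the inherited one.
function collectRuns(node: LangNode, inherited: string): Run[] {
  const lang = node.lang ?? inherited;
  const runs: Run[] = [];
  if (node.text !== undefined) {
    runs.push({ text: node.text, lang });
  }
  for (const child of node.children ?? []) {
    runs.push(...collectRuns(child, lang));
  }
  return runs;
}

// The nested example from this comment, modelled by hand.
const outerSpan: LangNode = {
  lang: "en-GB",
  children: [
    { text: "This is a test. " },
    { lang: "de-DE", text: "Dies ist ein Test." },
    { text: " This is a Test." },
  ],
};

console.log(collectRuns(outerSpan, "en-GB"));
```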



@joomla-cms-bot joomla-cms-bot removed this from the Joomla 4.1 milestone Nov 20, 2021
@richard67
Member

RTC



@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Nov 20, 2021
@richard67 richard67 added this to the Joomla 4.1 milestone Nov 20, 2021
@richard67
Member

Regarding documentation: I am not sure if we should rewrite TinyMCE's documentation. Maybe we should just link to that (if we haven't done yet). The PR is just enabling a builtin TinyMCE feature for us.

Another interesting question is if that could be extended by an optional field for the direction of a language and if set, add the "dir" attribute to that span, too.

@brianteeman
Contributor Author

Another interesting question is if that could be extended by an optional field for the direction of a language and if set, add the "dir" attribute to that span, too.

There is a separate tinymce button for that
image

@richard67
Member

richard67 commented Nov 20, 2021

I see ... well, I assume (not tested yet) that would wrap a span with a language attribute into another one with the direction attribute or vice versa, depending on the order of processing, which is not nice but valid. It would be nicer to have the right dir attribute directly in the language span.

@brianteeman
Contributor Author

Take it up with tinymce ;)

@brianteeman
Contributor Author

Seriously the reason they are separate is that you should not add dir=ltr if the rest is already ltr etc. And there is no reliable way to know what the direction of the text is when you mark something as being in a different language. This is the correct way to do this.

@bembelimen bembelimen merged commit d78a4c1 into joomla:4.1-dev Nov 21, 2021
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Nov 21, 2021
@bembelimen
Contributor

Thx

@brianteeman
Contributor Author

That's great - thank you

@brianteeman brianteeman deleted the language_of_parts branch November 21, 2021 13:06
Labels
a11y Accessibility Language Change This is for Translators NPM Resource Changed This Pull Request can't be tested by Patchtester
8 participants