[Feature] Web Speech API in Javascript #8794

mikunimaru · 2021-05-08T06:27:22Z

As a solution to all the functional requirements related to TTS, I would like to propose support for the Web Speech API in JavaScript.

With the Web Speech API, users could implement cloze-only speech, place a speak button on a card, and speak the text multiple times at different speeds, all from their own JavaScript.

This keeps the application simple by leaving overly complex TTS features to the user's JavaScript.

Below is the sample code.

<p><select id="voice"></select></p>
<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button>
  <button id="button2">Stop</button></p>

<script> 
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
if (window.speechSynthesis) {
  let voices = [];
  function setVoices() {
    if (voices.length) return;
    voices = speechSynthesis.getVoices();
    if (!voices.length) return;
    voices
      .filter(v => v.lang.startsWith("en"))
      .forEach(v => {
        let opt = document.createElement("option");
        opt.text = v.name;
        opt.voice = v;
        voice.appendChild(opt);
      });
  }
  speechSynthesis.addEventListener("voiceschanged", setVoices);
  setVoices();
}
button1.addEventListener("click", () => {
  let opt = voice.selectedOptions;
  if (!opt.length) return;
  let u = new SpeechSynthesisUtterance(textarea.value);
  u.voice = opt[0].voice;
  u.lang  = u.voice.lang;
  u.addEventListener("boundary", e => {
    if (e.name != "word") return;
    textarea.focus();
    textarea.setSelectionRange(e.charIndex, e.charIndex + e.charLength);
  });
  u.addEventListener("end", () => textarea.setSelectionRange(0, 0));
  speechSynthesis.speak(u);
});
button2.addEventListener("click", () => {
  if (!window.speechSynthesis) return;
  speechSynthesis.cancel();
});
 </script>

I've also prepared a page where you can easily try out the code.
https://codepen.io/mikunimaru/pen/poevqKq
The above code runs in my Android browser, so I think Android is capable of supporting the Web Speech API.
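
As a rough sketch of the "speak the text multiple times at different speeds" idea: the helper names `buildUtterances` and `speakAtRates` below are made up for illustration, and the synthesizer and utterance constructor are passed in as parameters so the logic can be tried outside a browser. It relies only on the standard `rate` property and on `speechSynthesis.speak()` queueing utterances in order.

```javascript
// Build one utterance per playback rate. `Utterance` is injected; in a card
// template you would pass window.SpeechSynthesisUtterance.
function buildUtterances(text, rates, Utterance) {
  return rates.map(rate => {
    const u = new Utterance(text);
    u.rate = rate; // 1 is normal speed; the spec allows 0.1 to 10
    return u;
  });
}

// Hand every utterance to the synthesizer. speak() appends to its queue,
// so the text is read once per rate, in the order given.
function speakAtRates(text, rates, synth, Utterance) {
  const utterances = buildUtterances(text, rates, Utterance);
  utterances.forEach(u => synth.speak(u));
  return utterances;
}

// Browser usage, only where the Web Speech API exists:
// if (typeof window !== "undefined" && window.speechSynthesis) {
//   speakAtRates("This is TTS test.", [0.6, 1],
//                window.speechSynthesis, window.SpeechSynthesisUtterance);
// }
```

Here the slow reading comes first and the normal-speed reading second; reordering the `rates` array changes that.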

welcome · 2021-05-08T06:27:24Z

Hello! 👋 Thanks for logging this issue. Please remember we are all volunteers here, so some patience may be required before we can get to the issue. Also remember that the fastest way to get resolution on an issue is to propose a change directly, https://github.com/ankidroid/Anki-Android/wiki/Contributing

krmanik · 2021-05-08T08:20:13Z

Have you tried this one? https://github.com/ankidroid/Anki-Android/wiki/FAQ#to-use-tts-on-ankidesktop-and-ankidroid
It works for cloze too.

SpeechSynthesisUtterance is not supported in the Android WebView, so the above code will not work inside the AnkiDroid reviewer. Maybe a Java implementation will be done to provide TTS. Also, after the Rust conversion, TTS will be improved in the latest release of AnkiDroid.
https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance

mikunimaru · 2021-05-08T09:15:47Z

> Have you tried this one? https://github.com/ankidroid/Anki-Android/wiki/FAQ#to-use-tts-on-ankidesktop-and-ankidroid
> It works for cloze too.
>
> SpeechSynthesisUtterance is not supported in the Android WebView, so the above code will not work inside the AnkiDroid reviewer. Maybe a Java implementation will be done to provide TTS. Also, after the Rust conversion, TTS will be improved in the latest release of AnkiDroid.
> https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance

In the current app specification, there is no way to speak only the revealed part of the text when a cloze is revealed.

Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript?
https://developer.chrome.com/docs/android/custom-tabs/overview/

The availability of powerful JavaScript provides great extensibility to AnkiDroid, which has no add-ons.

krmanik · 2021-05-08T09:45:02Z

> In the current app specification, there is no way to achieve the behavior of speaking only the opened part of the text when cloze is opened.

Maybe in AnkiDroid 2.15 or 2.16 TTS will be greatly improved, including for cloze.

> Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript?
> https://developer.chrome.com/docs/android/custom-tabs/overview/

I don't think so, but you can submit a PR.

> The availability of powerful JavaScript provides great extensibility to AnkiDroid, which has no add-ons.

JS add-on support will be available for AnkiDroid soon.

mikunimaru · 2021-05-09T02:08:04Z

I've read the comments and understand that the Web Speech API is difficult to use here.
Then what about an alternative: providing an API that can freely call AnkiDroid's TTS from JavaScript in the template?

Such an API would give users a lot of freedom to customize TTS behavior.

mikunimaru · 2021-05-10T06:50:54Z

I tried adding a TTS API for JS with a simple modification.
mikunimaru@b04f7cd

SVID_20210510_152829.mp4

<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button></p>

<script>
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
// Register with the AnkiDroid JS API before calling any of its methods.
var jsApi = {"version" : "0.0.1", "developer" : "[email protected]"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

// Speak the textarea contents through AnkiDroid's TTS when the button is clicked.
button1.addEventListener("click", () => {
  const text = document.getElementById("textarea").value;
  AnkiDroidJS.speak_experimental(text);
});
</script>

Calling TTS this way seems to work very well.
Please consider adding a JavaScript API. (Adding a proper API is too difficult for me because I don't understand the code well enough to estimate the changes.)
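
Since the same template may run both in an environment with the Web Speech API and in the AnkiDroid reviewer, a card could pick whichever backend is present. This is only a sketch: `pickSpeaker` and its `env` parameter are hypothetical names, `AnkiDroidJS.speak_experimental` is the method from the experimental commit above rather than a released API, and the `AnkiDroidJS.init(...)` call from the sample is assumed to have run already.

```javascript
// Return a speak(text) function for the first TTS backend found on `env`
// (the global object, i.e. window in a card template), or null if none.
function pickSpeaker(env) {
  // Bridge method from the experimental commit in this thread (assumption:
  // it stays reachable under this name).
  if (env.AnkiDroidJS && typeof env.AnkiDroidJS.speak_experimental === "function") {
    return text => env.AnkiDroidJS.speak_experimental(text);
  }
  // Standard Web Speech API.
  if (env.speechSynthesis && typeof env.SpeechSynthesisUtterance === "function") {
    return text => env.speechSynthesis.speak(new env.SpeechSynthesisUtterance(text));
  }
  return null;
}

// Template usage:
// const speak = pickSpeaker(window);
// if (speak) speak("This is TTS test.");
```

Returning `null` instead of throwing lets the template hide its Speak button when no backend exists.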

krmanik · 2021-05-10T07:17:19Z

You should create a PR for this, since you have already implemented it. But first wait for the AnkiDroid maintainers to give feedback on this feature.

I have also tried code like this, and it works well. You may take some ideas from it. This function goes in AbstractFlashcardViewer.java.
The benefit is that a custom language, TTS engine, pitch, and speed can be set or changed from the user's JS code.

        @JavascriptInterface
        public void ankiTextToSpeech(String text, String locale) {
            // The engine initializes asynchronously, so speak from the init callback.
            ankiJStts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
                @SuppressLint("NewApi")
                @Override
                public void onInit(int status) {
                    if (status != TextToSpeech.SUCCESS) {
                        return;
                    }
                    // Accept "zh_CN" as well as "zh-CN": the one-argument Locale
                    // constructor would treat the whole string as a language code.
                    Locale target = Locale.forLanguageTag(locale.replace('_', '-'));
                    int result = ankiJStts.setLanguage(target);
                    if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                        Timber.i("Not supported or missing language data.");
                    } else {
                        ankiJStts.speak(text, TextToSpeech.QUEUE_ADD, null, "0000000");
                    }
                }
            }, "com.google.android.tts");
        }

Usage:

<script> 
AnkiDroidJS.ankiTextToSpeech("你好！，世界", "zh_CN"); 
</script>

krmanik · 2021-05-10T07:23:34Z

Also, this is an old Discord message about this feature from @mikehardy:
> that TTS stuff looks very interesting! If it works well and I wonder if we can "bend" the built-in / "core" TTS implemented in AnkiDroid to simply emit the necessary HTML/JS using that.

mikunimaru · 2021-05-10T07:37:37Z

The video below shows the user-side cloze speech that I described in my first comment.

SVID_20210510_161354.mp4

<script> 
var jsApi = {"version" : "0.0.1", "developer" : "[email protected]"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

var cloze = document.querySelector(".cloze");
// Speak only when the card actually contains a cloze.
if (cloze) AnkiDroidJS.speak_experimental(cloze.textContent);
</script>

If the JavaScript TTS API is implemented, AnkiDroid will become even more powerful.
I'm a novice when it comes to Java, so I'd rather leave the creation of pull requests to users who are familiar with Java.
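
A slightly more defensive take on the cloze example, as a sketch only: a card can contain several `.cloze` spans or none, so the hypothetical helpers `clozeText` and `speakCloze` below gather all of them and stay silent when there is nothing to read. The `speak` argument is whatever TTS call is available, for example `text => AnkiDroidJS.speak_experimental(text)` from the experimental commit.

```javascript
// Collect the text of every cloze span under `root` (document in a card
// template; anything with querySelectorAll works).
function clozeText(root) {
  return Array.from(root.querySelectorAll(".cloze"))
    .map(node => (node.textContent || "").trim())
    .filter(text => text.length > 0)
    .join(" ");
}

// Speak the cloze text if there is any. Returns true when something was spoken.
function speakCloze(root, speak) {
  const text = clozeText(root);
  if (!text) return false; // no cloze on this card: stay silent
  speak(text);
  return true;
}

// Template usage (answer side):
// speakCloze(document, text => AnkiDroidJS.speak_experimental(text));
```

This is meant for the answer template; on the question side the `.cloze` span holds the `[...]` placeholder rather than the hidden text.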

mikehardy · 2021-05-11T13:32:08Z

Sure, propose a PR exposing TTS features via javascript, that's fine by me - just please make them generic so they are generally useful. I think that's a small hurdle though - I believe the only things we really do with TTS are start/stop the engine, start/stop speaking, add text to the speaking queue (?), and set a default TTS language for a deck (?)

github-actions · 2021-07-10T13:36:46Z

Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically

david-allison · 2021-11-08T00:47:54Z

@mikunimaru FYI: We're in the process of accepting Anki Desktop style TTS.

Anki-Android/AnkiDroid/src/main/java/com/ichi2/libanki/Sound.kt

Lines 27 to 37 in 253706a

    
    /**
     * Records information about a text to speech tag.
     */
    data class TTSTag(
        val fieldText: str,
        val lang: str,
        val voices: List<str>,
        val speed: Float,
        /** each arg should be in the form 'foo=bar' */
        val otherArgs: List<str>
    ) : AvTag()

I believe that this JS API is compatible with this class, but flagging it up in case we need to make changes before 2.16 goes out.
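
To make the fields of `TTSTag` concrete, here is a rough sketch of how the arguments of an Anki Desktop style tag such as `{{tts en_US voices=A,B speed=0.8:Front}}` could map onto them. This is not AnkiDroid's parser: `parseTtsArgs` is a hypothetical helper, it only handles the argument string before the colon, and the default speed of 1.0 is an assumption.

```javascript
// Map a tag argument string like "en_US voices=A,B speed=0.8 foo=bar" onto an
// object with the same field names as the TTSTag class above.
function parseTtsArgs(args, fieldText) {
  const [lang, ...rest] = args.trim().split(/\s+/);
  const tag = { fieldText, lang, voices: [], speed: 1.0, otherArgs: [] };
  for (const arg of rest) {
    const eq = arg.indexOf("=");
    const key = eq === -1 ? arg : arg.slice(0, eq);
    const value = eq === -1 ? "" : arg.slice(eq + 1);
    if (key === "voices") {
      tag.voices = value.split(",").filter(Boolean);
    } else if (key === "speed") {
      const speed = parseFloat(value);
      if (!Number.isNaN(speed)) tag.speed = speed; // keep the default on bad input
    } else {
      tag.otherArgs.push(arg); // kept verbatim in 'foo=bar' form
    }
  }
  return tag;
}
```

Anything other than `voices` and `speed` is passed through untouched in `otherArgs`, matching the `'foo=bar'` comment on that field.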

krmanik added the JS API label May 8, 2021

mikunimaru mentioned this issue May 12, 2021

New JavaScript api for TTS #8812

Merged

7 tasks

github-actions bot added the Stale label Jul 10, 2021

github-actions bot closed this as completed Jul 17, 2021

david-allison added this to the 2.16 release milestone Nov 8, 2021

david-allison removed the Stale label Nov 8, 2021
