[Feature] Web Speech API in Javascript #8794

Closed
mikunimaru opened this issue May 8, 2021 · 12 comments · Fixed by #8812

@mikunimaru
Contributor

As a solution to all the functional requirements related to TTS, I would like to propose support for the Web Speech API in JavaScript.

With Web Speech API support, users could reproduce the speech behavior of the cloze-only tag, place a speech button in the card, and speak the text multiple times at different speeds, all from their own JavaScript.

This keeps the application simple, because overly complex TTS features are left to the user's JavaScript.

Below is the sample code.

<p><select id="voice"></select></p>
<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button>
  <button id="button2">Stop</button></p>

<script>
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
// Everything lives inside this block so that nothing runs (or throws) where
// the Web Speech API is missing.
if (window.speechSynthesis) {
  const voiceSelect = document.getElementById("voice");
  const textarea = document.getElementById("textarea");
  let voices = [];

  // Fill the <select> with English voices. getVoices() can return an empty
  // list until "voiceschanged" fires, so this is called from both places.
  const setVoices = () => {
    if (voices.length) return;
    voices = speechSynthesis.getVoices();
    if (!voices.length) return;
    voices
      .filter(v => v.lang.startsWith("en"))
      .forEach(v => {
        const opt = document.createElement("option");
        opt.text = v.name;
        opt.voice = v; // keep the voice object on its <option>
        voiceSelect.appendChild(opt);
      });
  };
  speechSynthesis.addEventListener("voiceschanged", setVoices);
  setVoices();

  document.getElementById("button1").addEventListener("click", () => {
    const opt = voiceSelect.selectedOptions;
    if (!opt.length) return;
    const u = new SpeechSynthesisUtterance(textarea.value);
    u.voice = opt[0].voice;
    u.lang = u.voice.lang;
    // Highlight each word in the textarea as it is spoken.
    u.addEventListener("boundary", e => {
      if (e.name != "word") return;
      textarea.focus();
      // Fall back to an empty selection if charLength is missing.
      textarea.setSelectionRange(e.charIndex, e.charIndex + (e.charLength || 0));
    });
    u.addEventListener("end", () => textarea.setSelectionRange(0, 0));
    speechSynthesis.speak(u);
  });
  document.getElementById("button2").addEventListener("click", () => {
    speechSynthesis.cancel();
  });
}
</script>

I've also prepared a page where you can easily try out the code:
https://codepen.io/mikunimaru/pen/poevqKq
The code above runs in my Android browser, so I think Android is capable of supporting the Web Speech API.
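
The "speak the text multiple times at different speeds" idea from the proposal could look roughly like the sketch below. It is only an illustration: `speakAtRates` is a hypothetical helper name, and the synthesizer and utterance constructor are passed in as parameters so the function can also be exercised outside a browser. In a page they would be `window.speechSynthesis` and `window.SpeechSynthesisUtterance`.

```javascript
// Sketch: queue the same text several times, once per speech rate.
// `synth` and `Utterance` are injected; in a browser pass
// window.speechSynthesis and window.SpeechSynthesisUtterance.
function speakAtRates(text, rates, synth, Utterance) {
  return rates.map((rate) => {
    const u = new Utterance(text);
    u.rate = rate;  // 1 is the default rate in the Web Speech API
    synth.speak(u); // speak() appends the utterance to the queue
    return u;
  });
}

// Browser usage (slow first, then normal speed):
// speakAtRates("This is TTS test.", [0.6, 1], speechSynthesis, SpeechSynthesisUtterance);
```

Because `speak()` queues utterances, the second reading starts only after the first one finishes.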

@welcome

welcome bot commented May 8, 2021

Hello! 👋 Thanks for logging this issue. Please remember we are all volunteers here, so some patience may be required before we can get to the issue. Also remember that the fastest way to get resolution on an issue is to propose a change directly, https://github.com/ankidroid/Anki-Android/wiki/Contributing

@krmanik krmanik added the JS API label May 8, 2021
@krmanik
Member

krmanik commented May 8, 2021

Have you tried this one? https://github.com/ankidroid/Anki-Android/wiki/FAQ#to-use-tts-on-ankidesktop-and-ankidroid
It also works for cloze.

SpeechSynthesisUtterance is not supported in Android WebView, so the code above will not work inside the AnkiDroid reviewer. Maybe a Java implementation will be written to provide TTS. Also, after the Rust conversion, TTS will be improved in the latest release of AnkiDroid.
https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance
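
Whether a given WebView exposes the API can be checked from the card itself. The following is a minimal sketch, not part of AnkiDroid; `hasSpeechSynthesis` is a hypothetical helper that takes a window-like object so it can be tried with plain objects as well.

```javascript
// Sketch: report whether a window-like object exposes Web Speech synthesis.
function hasSpeechSynthesis(win) {
  return Boolean(
    win &&
    typeof win.speechSynthesis === "object" &&
    win.speechSynthesis !== null &&
    typeof win.SpeechSynthesisUtterance === "function"
  );
}

// In a card template:
// if (!hasSpeechSynthesis(window)) { /* hide the Speak button */ }
```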

@mikunimaru
Contributor Author

> Have you tried this one? https://github.com/ankidroid/Anki-Android/wiki/FAQ#to-use-tts-on-ankidesktop-and-ankidroid
> It also works for cloze.
>
> SpeechSynthesisUtterance is not supported in Android WebView, so the code above will not work inside the AnkiDroid reviewer. Maybe a Java implementation will be written to provide TTS. Also, after the Rust conversion, TTS will be improved in the latest release of AnkiDroid.
> https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance

With the app as it currently stands, there is no way to speak only the revealed part of the text when a cloze is revealed.

Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript?
https://developer.chrome.com/docs/android/custom-tabs/overview/

Powerful JavaScript would give great extensibility to AnkiDroid, which has no add-ons.

@krmanik
Member

krmanik commented May 8, 2021

> In the current app specification, there is no way to achieve the behavior of speaking only the opened part of the text when cloze is opened.

Maybe in AnkiDroid 2.15 or 2.16, TTS will be greatly improved, including for cloze.

> Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript?
> https://developer.chrome.com/docs/android/custom-tabs/overview/

I don't think so, but you can submit a PR.

> The availability of powerful JavaScript provides great extensibility to AnkiDroid, which has no add-ons.

JS add-on support will be available for AnkiDroid soon.

@mikunimaru
Contributor Author

I've read the comments and understand that the Web Speech API is difficult to support.
What about an alternative, then: providing an API that can freely call AnkiDroid's TTS from JavaScript in the template?

Such an API would give users a lot of freedom to customize TTS behavior.

@mikunimaru
Contributor Author

I tried adding a TTS API for JS with a simple modification:
mikunimaru@b04f7cd

SVID_20210510_152829.mp4
<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button></p>

<script>
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
// Register with the AnkiDroid JS API before calling any of its methods.
var jsApi = {"version" : "0.0.1", "developer" : "[email protected]"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

document.getElementById("button1").addEventListener("click", () => {
  const text = document.getElementById("textarea").value;
  // speak_experimental is the method added in the commit above.
  AnkiDroidJS.speak_experimental(text);
});
</script>

Calling TTS this way seems to work very well.
Please consider adding a JavaScript API. (Adding a full API myself is too difficult, because I don't understand the code well enough to estimate the changes.)
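
A template that should also work where the bridge is absent could pick a backend at runtime. This is a sketch under assumptions: `pickSpeaker` is a hypothetical helper, `speak_experimental` is the experimental method from the commit above rather than a released API, and the `AnkiDroidJS.init(...)` call from the sample is assumed to have been made already.

```javascript
// Sketch: choose a TTS backend for a card template.
// `env` is a window-like object (pass `window` in a card).
function pickSpeaker(env) {
  const bridge = env.AnkiDroidJS;
  if (bridge && typeof bridge.speak_experimental === "function") {
    // Experimental AnkiDroid bridge method from the commit above.
    return (text) => bridge.speak_experimental(text);
  }
  if (env.speechSynthesis && typeof env.SpeechSynthesisUtterance === "function") {
    // Fall back to the Web Speech API where the platform provides it.
    return (text) => env.speechSynthesis.speak(new env.SpeechSynthesisUtterance(text));
  }
  return null; // no TTS backend available
}

// In a card template:
// const speak = pickSpeaker(window);
// if (speak) speak("This is TTS test.");
```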

@krmanik
Member

krmanik commented May 10, 2021

You should create a PR for this, since you have already implemented it. But first wait for the AnkiDroid maintainers to give feedback on this feature.

I have also tried code like the following, and it works well. You may take some ideas from it. This function goes in AbstractFlashcardViewer.java.
The benefit is that a custom language, TTS engine, pitch and speed can be added or changed from the user's JS code.

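// Imports needed in AbstractFlashcardViewer.java: android.annotation.SuppressLint,
// android.speech.tts.TextToSpeech, android.webkit.JavascriptInterface,
// java.util.Locale, timber.log.Timber.
@JavascriptInterface
public void ankiTextToSpeech(String text, String local) {
    // A new engine is created per call; it speaks once initialization succeeds.
    ankiJStts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
        @SuppressLint("NewApi")
        @Override
        public void onInit(int status) {
            if (status != TextToSpeech.SUCCESS) {
                return;
            }
            // Accept "zh_CN" as well as "zh-CN": new Locale("zh_CN") would
            // treat the whole string as a language code.
            int result = ankiJStts.setLanguage(Locale.forLanguageTag(local.replace('_', '-')));
            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Timber.i("Not supported or missing language data.");
            } else {
                ankiJStts.speak(text, TextToSpeech.QUEUE_ADD, null, "0000000");
            }
        }
    }, "com.google.android.tts");
}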

Usage:

<script> 
AnkiDroidJS.ankiTextToSpeech("你好!,世界", "zh_CN"); 
</script>
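
The usage above passes an underscore-style locale name (`zh_CN`), while the Web Speech API's `lang` property takes BCP 47 tags such as `zh-CN`. A template that targets both could keep one value and convert it as needed. The helper names below are hypothetical and the conversion is deliberately naive: it only swaps the separator.

```javascript
// Sketch: convert between "zh_CN" style names and "zh-CN" style tags.
// Only the separator is changed; no validation is attempted.
function toBcp47(localeName) {
  return String(localeName).trim().replace(/_/g, "-");
}

function toUnderscore(languageTag) {
  return String(languageTag).trim().replace(/-/g, "_");
}

// toBcp47("zh_CN")      gives "zh-CN"
// toUnderscore("zh-CN") gives "zh_CN"
```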

@krmanik
Member

krmanik commented May 10, 2021

Also, here is an old Discord message from @mikehardy about this feature:
> that TTS stuff looks very interesting! If it works well and I wonder if we can "bend" the built-in / "core" TTS implemented in AnkiDroid to simply emit the necessary HTML/JS using that.

@mikunimaru
Contributor Author

The video below shows the user-side cloze speech that I described in my first comment.

SVID_20210510_161354.mp4
<script>
var jsApi = {"version" : "0.0.1", "developer" : "[email protected]"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

// Speak the cloze text, but only if the card actually has a cloze element.
var cloze = document.querySelector(".cloze");
if (cloze) {
  AnkiDroidJS.speak_experimental(cloze.textContent);
}
</script>

If a JavaScript TTS API is implemented, AnkiDroid will become even more powerful.
I'm a novice when it comes to Java, so I'd rather leave the pull request to someone who is familiar with it.
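
For cards with several clozes, the same idea can be extended to collect every revealed cloze before speaking. This is a rough sketch with a hypothetical helper name. It assumes that hidden clozes on the question side are rendered as bracketed placeholders such as `[...]`, and it will therefore also skip a real answer that happens to be wrapped in square brackets.

```javascript
// Sketch: gather the text of every revealed cloze under `root`.
// Assumption: hidden clozes appear as bracketed placeholders like "[...]".
function revealedClozeText(root) {
  return Array.from(root.querySelectorAll(".cloze"))
    .map((el) => el.textContent.trim())
    .filter((t) => t && !/^\[.*\]$/.test(t)) // drop empty text and placeholders
    .join(" ");
}

// In a card template, with the experimental method from this thread:
// const text = revealedClozeText(document);
// if (text) AnkiDroidJS.speak_experimental(text);
```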

@mikehardy
Member

Sure, propose a PR exposing TTS features via javascript, that's fine by me - just please make them generic so they are generally useful. I think that's a small hurdle though - I believe the only things we really do with TTS are start/stop the engine, start/stop speaking, add text to the speaking queue (?), and set a default TTS language for a deck (?)
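
One possible shape for such a generic surface, seen from the template side, is sketched below. Everything here is hypothetical: `createTtsFacade` and the `bridge` methods (`setLanguage`, `speak`, `stop`) are illustrative names, not an existing AnkiDroid API. The two queue constants mirror the values of `QUEUE_FLUSH` and `QUEUE_ADD` in Android's `TextToSpeech` class.

```javascript
// Hypothetical sketch of a small, generic TTS facade for card templates.
const QUEUE_FLUSH = 0; // replace whatever is currently queued
const QUEUE_ADD = 1;   // append to the speaking queue

function createTtsFacade(bridge) {
  return {
    setLanguage: (tag) => bridge.setLanguage(tag),
    speak: (text) => bridge.speak(text, QUEUE_FLUSH),   // start speaking now
    enqueue: (text) => bridge.speak(text, QUEUE_ADD),   // add to the queue
    stop: () => bridge.stop(),                          // stop speaking
  };
}
```

Keeping the facade this thin leaves decisions such as what to read and in which order to the user's JavaScript.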

@github-actions
Contributor

Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically

@github-actions github-actions bot added the Stale label Jul 10, 2021
@david-allison david-allison added this to the 2.16 release milestone Nov 8, 2021
@david-allison
Member

david-allison commented Nov 8, 2021

@mikunimaru FYI: We're in the process of accepting Anki Desktop style TTS.

/**
 * Records information about a text to speech tag.
 */
data class TTSTag(
    val fieldText: str,
    val lang: str,
    val voices: List<str>,
    val speed: Float,
    /** each arg should be in the form 'foo=bar' */
    val otherArgs: List<str>
) : AvTag()

I believe that this JS API is compatible with this class, but flagging it up in case we need to make changes before 2.16 goes out.
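
To see how template-side arguments could map onto these fields, here is a sketch of a parser written in JavaScript. It assumes an Anki Desktop style argument string such as `ja_JP voices=Apple_Otoya,Microsoft_Haruka speed=0.8` (the part of a `{{tts ...:Field}}` tag before the colon). `parseTtsArgs` is a hypothetical helper, and the default speed of 1.0 is an assumption rather than something stated in this thread.

```javascript
// Sketch: turn a desktop-style TTS argument string into a TTSTag-like object.
function parseTtsArgs(fieldText, argString) {
  const [lang = "", ...rest] = argString.trim().split(/\s+/);
  // speed: 1.0 is an assumed default.
  const tag = { fieldText, lang, voices: [], speed: 1.0, otherArgs: [] };
  for (const arg of rest) {
    const eq = arg.indexOf("=");
    const key = eq === -1 ? arg : arg.slice(0, eq);
    const value = eq === -1 ? "" : arg.slice(eq + 1);
    if (key === "voices") {
      tag.voices = value.split(",").filter(Boolean);
    } else if (key === "speed") {
      const speed = parseFloat(value);
      if (!Number.isNaN(speed)) tag.speed = speed; // keep the default if unparsable
    } else {
      tag.otherArgs.push(arg); // kept verbatim in 'foo=bar' form
    }
  }
  return tag;
}
```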
