Estonian language support #688

Closed
mangotree123 opened this issue Dec 17, 2024 · 6 comments · Fixed by #740
Labels
languages Dictionary or language related issues

Comments

@mangotree123

Hi! I've been using tt9 for a while and it's fantastic. Any way support for Estonian could be added?
Thanks!

@sspanak
Owner

sspanak commented Dec 19, 2024

Of course. If you'd like to help, you can give me a hand with searching for a good word list or dictionary. If there is a respected university, academy, or institute that regulates the language and has issued something like "the big dictionary of Estonian", I could make use of it. Such dictionaries are spell-checked and contain many different word forms, which results in very high-quality suggestions. Since it is difficult for me to search Estonian websites, I would appreciate it if you could do this for me.

@sspanak sspanak added the languages Dictionary or language related issues label Dec 19, 2024
@sspanak sspanak added the more info needed Further information is requested label Dec 30, 2024
@mangotree123
Author

Hi, sorry for the late response, I was away from home for the holidays. The official Estonian dictionary is ÕS, but I can't seem to find a downloadable version of it. The official way to access it online is through https://sonaveeb.ee but it functions like an online dictionary.

@sspanak
Owner

sspanak commented Jan 5, 2025

> Hi, sorry for the late response, I was away from home for the holidays.

All good!

> The official Estonian dictionary is ÕS, but I can't seem to find a downloadable version of it. The official way to access it online is through https://sonaveeb.ee/ but it functions like an online dictionary.

They seem to have an API. I can try using the /api/word/details/{wordId} endpoint. It provides a lot of nice info, including the word frequency. I am just not sure if I will be able to register and get a key from https://ekilex.ee/register
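
For illustration, here is a rough sketch of how a request to that endpoint might be put together in Python. Only the `/api/word/details/{wordId}` path comes from the discussion above. The base URL `https://ekilex.ee`, the `ekilex-api-key` header name, and the function name `build_word_details_request` are assumptions made for this example and would need to be checked against the actual API documentation. The sketch only builds the request and does not send it.

```python
import urllib.request
from urllib.parse import quote

# Assumption: the API is served from the same host as the registration page.
BASE_URL = "https://ekilex.ee"


def build_word_details_request(word_id, api_key):
    """Build (but do not send) a GET request for /api/word/details/{wordId}."""
    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{BASE_URL}/api/word/details/{quote(str(word_id), safe='')}"
    # Assumption: the key is passed in an "ekilex-api-key" request header.
    headers = {"ekilex-api-key": api_key}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the shape of the JSON response is not assumed here.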

@sspanak sspanak removed the more info needed Further information is requested label Jan 5, 2025
@mangotree123
Author

It seems like they don't limit access much, at least registering is as simple as any account creation process.
Not quite so sure about the API access, but it seems pretty straightforward too.
If needed, I could try to make the request on your behalf.

@sspanak
Owner

sspanak commented Feb 4, 2025

Sure, I don't mind getting a hand.

The sample API response looks quite complex. It would be great if you could identify which parts of it contain words, word variants, synonyms and whatnot, but not foreign words. I'd prefer collecting as many as possible and then, if they happen to be too many, say above 1 million, I can easily filter them by frequency of use.
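
The frequency filtering mentioned above could look roughly like the following sketch. It assumes the collected words end up in a plain mapping of word to frequency score; that structure and the name `limit_by_frequency` are hypothetical and are not taken from the TT9 code base.

```python
def limit_by_frequency(word_freqs, max_words=1_000_000):
    """Keep at most max_words entries, preferring the most frequent ones."""
    if len(word_freqs) <= max_words:
        # Nothing to trim; return a copy so the caller's dict stays untouched.
        return dict(word_freqs)
    # Sort by descending frequency, breaking ties alphabetically so the
    # result is deterministic.
    ranked = sorted(word_freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:max_words])
```

For example, calling `limit_by_frequency(words, max_words=500_000)` would halve the default cap while still keeping the most common words.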

@sspanak
Owner

sspanak commented Feb 20, 2025

@mangotree123, I was able to extract the words and have added Estonian to TT9. I will include it in the next version. Before that, I would appreciate it if you could test it and let me know if anything is missing or if you notice any misspelled words. Here is an APK for you to try it out.

@sspanak sspanak linked a pull request Feb 20, 2025 that will close this issue
2 participants