Russian LMentry tasks #18

Open
vvchernov opened this issue Nov 1, 2023 · 5 comments

@vvchernov

Create new tasks similar to the LMentry tasks, but with queries and answer patterns translated into Russian and adapted to language-specific features.
For how to create a new task, see the example.

@eskondrashova

I collected some data from different datasets and other resources. The examples and descriptions are now available in my repo: https://github.com/eskondrashova/lmentry/tree/main/resources_ru

I'm going to correct the tasks using the collected data by the end of the week and then open a pull request.

@eskondrashova

I collected some data from this site: https://www.ros-edu.ru/basic-dictionary. It groups Russian words by level (A1-C2). For my assignment, I used only words from levels A1-B1. I then processed the data with mystem to determine each word's part of speech. After that, I saved all entries to a single JSON file; each entry consists of 1) a word, 2) its possible parts of speech (if there are several), and 3) its level.
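For illustration, here is a minimal sketch of how such entries could be assembled. It assumes the analysis results are available as mystem-style JSON records (a `text` field plus an `analysis` list whose items carry a `gr` grammar string). The sample records are hand-written, and the names `pos_tags`, `build_entries` and the `word`/`pos`/`level` keys are hypothetical, not the actual layout used in the repo.

```python
import json
import re


def pos_tags(token_record):
    """Collect the distinct part-of-speech tags from one mystem-style record."""
    tags = []
    for variant in token_record.get("analysis", []):
        # Assumption: the POS is the leading tag of the "gr" string,
        # e.g. "S,муж,неод=им,ед" -> "S".
        pos = re.split(r"[,=]", variant.get("gr", ""), maxsplit=1)[0]
        if pos and pos not in tags:
            tags.append(pos)
    return tags


def build_entries(token_records, level):
    """Turn analysed words of a single level into dictionary entries."""
    return [
        {"word": record["text"], "pos": pos_tags(record), "level": level}
        for record in token_records
    ]


# Hand-written sample records (not real mystem output).
sample = [
    {"text": "дом", "analysis": [{"lex": "дом", "gr": "S,муж,неод=(вин,ед|им,ед)"}]},
    {"text": "печь", "analysis": [
        {"lex": "печь", "gr": "S,жен,неод=(вин,ед|им,ед)"},
        {"lex": "печь", "gr": "V,несов,пе=инф"},
    ]},
]

entries = build_entries(sample, "A1")
# ensure_ascii=False keeps the Cyrillic readable in the output.
print(json.dumps(entries, ensure_ascii=False, indent=2))
```

Keeping the part-of-speech field as a list makes words with several possible readings easy to filter later.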

However, at first I did not notice that the list contains “words” like “во время” ("during") and “до свидания” ("goodbye"), which are in fact phrases rather than words. It also contains forms like "летом" ("in summer") and "осенью" ("in autumn"), which, if lemmatized, become "лето" and "осень".
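A rough sketch of how the two problem cases could be separated from the clean entries is shown here. The lemmatizer is passed in as a function; the toy lookup table below is only a stand-in, and a real lemmatizer may treat some of these forms differently. All names here (`split_entries`, `toy_lemmatize`) are hypothetical.

```python
def split_entries(words, lemmatize):
    """Split a word list into phrases, inflected forms and clean lemmas."""
    phrases, inflected, clean = [], [], []
    for word in words:
        if len(word.split()) > 1:
            # More than one whitespace-separated token: a phrase, not a word.
            phrases.append(word)
            continue
        lemma = lemmatize(word)
        if lemma != word:
            # The entry is not in dictionary form; keep both for review.
            inflected.append((word, lemma))
        else:
            clean.append(word)
    return phrases, inflected, clean


# Toy stand-in for a real lemmatizer (assumption, for illustration only).
toy_lemmas = {"летом": "лето", "осенью": "осень"}


def toy_lemmatize(word):
    return toy_lemmas.get(word, word)


phrases, inflected, clean = split_entries(
    ["во время", "до свидания", "летом", "осенью", "дом"], toy_lemmatize
)
print(phrases)
print(inflected)
print(clean)
```

Returning the inflected forms together with their lemmas leaves the choice open between replacing them with the lemma and dropping them.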

I will fix this later today.

@eskondrashova

85285e6

@vvchernov
Author

PR
