Include necessary files in template #10

Open
reynoldsnlp opened this issue Feb 16, 2021 · 4 comments

Comments

@reynoldsnlp
Contributor

This issue may actually belong to gut. I am able to run ./autogen.sh, ./configure, and make in giellalt/template-lang-und without errors, but when I tried the same steps right away with lang-rue and other fresh repositories, I got errors. Somehow the new repositories are initialized in a broken state.

make[2]: Leaving directory '/Users/robertreynolds/gt/lang-rue/src/orthography'
Making all in cg3
make[2]: Entering directory '/Users/robertreynolds/gt/lang-rue/src/cg3'
make[2]: *** No rule to make target 'dependency.cg3', needed by 'dependency.bin'.  Stop.
make[2]: Leaving directory '/Users/robertreynolds/gt/lang-rue/src/cg3'
make[1]: *** [Makefile:1187: all-recursive] Error 1
make[1]: Leaving directory '/Users/robertreynolds/gt/lang-rue/src'
make: *** [Makefile:538: all-recursive] Error 1
@reynoldsnlp
Contributor Author

dependency.cg3 and functions.cg3 are required by giella-core, but they are not present in this repository. To complicate matters further, .gitignore contains the following:

/src/cg3/dependency.cg3
/src/cg3/functions.cg3

Very strange that files that giella-core explicitly requires are in .gitignore.

@snomos changed the title from "Include necessary files in" to "Include necessary files in template" on Mar 24, 2021
@snomos
Member

snomos commented Mar 24, 2021

The idea is that these files are automatically copied from giella-shared, since they tend to be rather language independent. @Trondtr knows more about these files.

@Trondtr
Contributor

Trondtr commented Mar 24, 2021

This setup goes back to a presentation we had in 2010:
Antonsen, L., Wiechetek, L. and T. Trosterud 2010: Reusing Grammatical Resources for New Languages. In Proceedings of the International conference on Language Resources and Evaluation LREC 2010. p. 2782–2789. ISBN 2-9517408-6-7. Stroudsburg: The Association for Computational Linguistics. http://www.lrec-conf.org/proceedings/lrec2010/pdf/254_Paper.pdf
where we showed that we were able to add functions and dependencies to North and Lule Sámi, Faroese and Greenlandic, with the same grammars. Thus we have included them.
What I now do for e.g. Baltic Finnish languages is that I do not use the common functions.cg3, but I do use the dependency.cg3. For the dependency this is actually quite clear: If you are an object pointing to a transitive verb to your right (@obj>), then all that is left for the dependency grammar is to pick the nearest transitive verb as your mother. Assigning the object tag in the first place (functions.cg3) is a bit less language independent.

Another issue is of course that dependency analysis of running text is not a topic for the first year of work on language X anyway. But still, when one gets there one would wish for a nice entrance.

A possibility could be to:

  • have a script for converting the fst tags to cg3 preamble, i.e. from
    +N +Nom +Sg +Pl
    to
    LIST N = N ;
    LIST Nom = Nom ;
    etc.

  • set up a dummy disambiguator.cg3 file with two rules for case disambiguation, two for number, for person, ...

  • and a dummy functions.cg3 file with (some more) rules for mapping major functions (SUBJ, OBJ, ADVL, N>, ...)

  • and a setup for using the script to get all the fst tags installed at the beginning of the cg3 files. One might perhaps even use the INCLUDE command (and include the tags from a generated tag file at runtime).

  • The dependency file could then be held as-is.
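
The tag-conversion step proposed in the first bullet could be sketched roughly as follows. This is only an illustration, not an existing giella-core script: the function name fst_tags_to_cg3_lists is hypothetical, and it assumes the fst tags arrive as a whitespace-separated string with a leading "+" on each tag. Tags containing characters that need quoting in cg3 are not handled.

```python
def fst_tags_to_cg3_lists(tag_line: str) -> str:
    """Turn '+N +Nom +Sg +Pl' into one 'LIST X = X ;' line per tag.

    Hypothetical helper: assumes whitespace-separated tags, each
    with a single leading '+'. Duplicate tags are emitted only once.
    """
    seen = set()
    lines = []
    for raw in tag_line.split():
        # Drop the leading '+' that marks a tag on the fst side.
        tag = raw[1:] if raw.startswith("+") else raw
        if not tag or tag in seen:
            continue
        seen.add(tag)
        lines.append(f"LIST {tag} = {tag} ;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(fst_tags_to_cg3_lists("+N +Nom +Sg +Pl"))
```

The printed lines could be written to a generated tag file and pulled into the grammar files, as suggested in the bullet about INCLUDE.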

@TinoDidriksen
Member

* have a script for converting the fst tags to cg3 preamble, i.e. from
  +N +Nom +Sg +Pl
  to
  LIST N = N ;
  LIST Nom = Nom ;
  etc.

As per https://visl.sdu.dk/cg3/chunked/tags.html#list-tags this could be simplified to a single line:
LIST-TAGS += N Nom Sg Pl etc ;
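
A conversion script targeting this single-line form might look like the sketch below. The function name fst_tags_to_list_tags is hypothetical, and the input format (whitespace-separated fst tags, each with a leading "+") is an assumption; tags that would need quoting in cg3 are not handled.

```python
def fst_tags_to_list_tags(tag_line: str) -> str:
    """Turn '+N +Nom +Sg +Pl' into 'LIST-TAGS += N Nom Sg Pl ;'.

    Hypothetical helper: assumes whitespace-separated tags, each
    with a single leading '+'. Duplicates are dropped, order is kept.
    """
    tags = []
    for raw in tag_line.split():
        # Drop the leading '+' that marks a tag on the fst side.
        tag = raw[1:] if raw.startswith("+") else raw
        if tag and tag not in tags:
            tags.append(tag)
    if not tags:
        return ""
    return "LIST-TAGS += " + " ".join(tags) + " ;"


if __name__ == "__main__":
    print(fst_tags_to_list_tags("+N +Nom +Sg +Pl"))
```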
