EXAMPLE PROJECT #7

Closed
ghost opened this issue Jan 26, 2017 · 10 comments

@ghost

ghost commented Jan 26, 2017

Could you make an example project from this code that I can run from the terminal on macOS, please? I'm not familiar with Maven.
Thanks a lot in advance.

P.S. If you make this example project, please tell me how to run it. Thanks!

@digitalheir
Owner

Strictly speaking, you don't need Maven; you can just include the jar on the classpath.
https://github.com/digitalheir/java-probabilistic-earley-parser/releases

I only now realized you might not know how to compile Java code. I had assumed that anyone who wants to use this project is a Java developer, so I'm quite interested in your use case. Could you share it?

Anyway, it's a good idea to make the project runnable by anyone who can at least write a context-free grammar. I'll work on that at some point. It shouldn't be much work.

I don't have access to a dev device currently and I'm working full time, so I can make a project for you probably next week.

@ghost
Author

ghost commented Jan 26, 2017

I meant to ask whether your project requires any special settings to run. Anyway, I know how to run Java code :-) I'm a computer engineer. Thanks a lot! I'll wait for your example project next week, if you can manage it. Thanks again! :-)

@digitalheir
Owner

I released a new version with usage details, check it out: https://github.com/digitalheir/java-probabilistic-earley-parser/releases

Thanks for reporting this. It forced me to run through a command-line example and fix a bug!

Let me know how it works for you. ~

@digitalheir digitalheir self-assigned this Feb 2, 2017
@ghost
Author

ghost commented Feb 3, 2017

Hi! Your code works, but there is a small problem.
If I type:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy heave

The result is

0.128
└──
    └── S
        ├── NP
        │   ├── Det
        │   │   └── the (the)
        │   └── N
        │       └── heavy (heavy)
        └── VP
            └── V
                └── heave (heave)

In this case the input string belongs to the grammar. But if I type a string containing a word that doesn't belong to the input grammar, like:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy ball

("ball" doesn't belong to the grammar), the result is:

Exception in thread "main" java.lang.NullPointerException
at org.leibnizcenter.cfg.earleyparser.CommandLine.main(CommandLine.java:44)

There is a problem in the CommandLine class.

@digitalheir
Owner

Well, what do you expect? It's an illegal sentence, so the code throws an exception. I'll make the output a bit prettier I guess, as part of the error handling issue #5

@ghost
Author

ghost commented Feb 3, 2017

The output should be an error such as: symbol "ball" doesn't belong to the input CFG.
We should handle this error as explained at the end of issue #5, for example by inserting the word with a high probability.

@digitalheir
Owner

digitalheir commented Feb 8, 2017

Hi Dan. Additional functionality is coming. I'll try to finish it this evening. I'll add command line options for different scan modes, either:

  1. throw an error (strict mode as it is now, with better logging);
  2. ignore the unfound word (act as if it didn't exist);
  3. replace the unfound word with a wildcard that matches any category
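
To make the three modes concrete, here is a minimal sketch of how each one could treat an unknown word. It is an illustration under assumptions, not the library's code: the names `ScanModeSketch`, `ScanMode`, `prepare` and the `<any>` placeholder are all hypothetical, and the sketch filters tokens up front, whereas a parser could equally make this decision during scanning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the library.
class ScanModeSketch {
    enum ScanMode { STRICT, DROP, WILDCARD }

    // Assumed placeholder standing for "a token that matches any category".
    static final String WILDCARD_TOKEN = "<any>";

    // Returns the token sequence the parser would see under the given mode.
    static List<String> prepare(List<String> tokens, Set<String> lexicon, ScanMode mode) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (lexicon.contains(token)) {
                out.add(token);
                continue;
            }
            switch (mode) {
                case STRICT:
                    // Fail with a descriptive message instead of a bare NullPointerException.
                    throw new IllegalArgumentException(
                            "Token \"" + token + "\" does not occur in the grammar");
                case DROP:
                    // Act as if the unknown word was never there.
                    break;
                case WILDCARD:
                    // Keep the position, but let it match any category.
                    out.add(WILDCARD_TOKEN);
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> lexicon = new HashSet<>(Arrays.asList("the", "heavy", "heave"));
        List<String> sentence = Arrays.asList("the", "heavy", "ball");
        System.out.println(prepare(sentence, lexicon, ScanMode.DROP));     // [the, heavy]
        System.out.println(prepare(sentence, lexicon, ScanMode.WILDCARD)); // [the, heavy, <any>]
    }
}
```

With the grammar from this thread, drop mode would leave "the heavy", which has no verb, so dropping a word does not guarantee that the remaining sentence parses.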

It is difficult to pick a word from the lexicon, because the parser does not know (and should not know) the words that follow, and so cannot tell which words would make a correct sentence.

For the wildcard option, you'll find the most likely category with the Viterbi parse. In post-processing you can select a random word from your lexicon, but I don't think that should be a task for this library, because it seems pretty application-specific to me.
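
The post-processing step could look roughly like the sketch below, which runs in application code after parsing. Everything in it is an assumption for illustration: the class `WildcardFillSketch`, the method `pickWord`, the category-to-words map, and the extra word "lift" are hypothetical, and it presumes the Viterbi parse has already reported the category (here `V`) assigned to the wildcard.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical application-side helper; not part of the library.
class WildcardFillSketch {
    // Picks a random lexicon word belonging to the category that the
    // Viterbi parse assigned to the wildcard position.
    static String pickWord(Map<String, List<String>> lexiconByCategory,
                           String category, Random random) {
        List<String> candidates = lexiconByCategory.get(category);
        if (candidates == null || candidates.isEmpty()) {
            throw new IllegalArgumentException("No words known for category " + category);
        }
        return candidates.get(random.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        Map<String, List<String>> lexicon = new HashMap<>();
        lexicon.put("Det", Arrays.asList("the"));
        lexicon.put("N", Arrays.asList("heavy"));
        // "lift" is a made-up extra entry so that there is something to choose from.
        lexicon.put("V", Arrays.asList("heave", "lift"));

        // Assumption: the parse of "the heavy <any>" labelled the wildcard as V.
        String replacement = pickWord(lexicon, "V", new Random());
        System.out.println("the heavy " + replacement);
    }
}
```

Weighting the choice by rule probability instead of picking uniformly would be a natural variation, but which policy is right depends on the application.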

@ghost
Author

ghost commented Feb 8, 2017

Thanks a lot! You're the best 😃

@digitalheir
Owner

I am building the new version right now. 0.9.12 will be available soon.

You can set the parse mode to lenient using either -scanmode drop or -scanmode wildcard.

I'm still thinking about how to communicate error events following an incident like this.

I'm thinking of passing a list of Exceptions to the ParseTree object that the parser ends up with.
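
One possible shape for that idea, sketched under assumptions: a result object that carries the rendered tree together with the exceptions collected while scanning leniently. The class `ParseOutcomeSketch` and its methods are hypothetical and do not reflect the library's actual `ParseTree` API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: these class and method names are illustrative and are
// not taken from the library.
class ParseOutcomeSketch {
    private final String renderedTree;
    private final List<Exception> issues = new ArrayList<>();

    ParseOutcomeSketch(String renderedTree) {
        this.renderedTree = renderedTree;
    }

    // Called whenever lenient scanning had to drop or wildcard a token.
    void addIssue(Exception issue) {
        issues.add(issue);
    }

    // Read-only view, so callers can inspect but not alter the recorded issues.
    List<Exception> getIssues() {
        return Collections.unmodifiableList(issues);
    }

    boolean isClean() {
        return issues.isEmpty();
    }

    String getRenderedTree() {
        return renderedTree;
    }

    public static void main(String[] args) {
        ParseOutcomeSketch outcome = new ParseOutcomeSketch("S(NP(the heavy) VP(<any>))");
        outcome.addIssue(new IllegalArgumentException(
                "Token \"ball\" does not occur in the grammar; replaced by a wildcard"));

        System.out.println(outcome.getRenderedTree());
        for (Exception issue : outcome.getIssues()) {
            System.out.println("warning: " + issue.getMessage());
        }
    }
}
```

Collecting exceptions instead of throwing them lets a lenient parse still return a tree while the caller decides whether the recorded issues are acceptable.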

@ghost
Author

ghost commented Feb 8, 2017

Thanks again!
Yes, that could be a great solution. To recover from an error caused by a wrong word in the input string, you could substitute it with another word (from a production with the same head) that has a higher probability.
Have you implemented the "synchronizing token" method for error recovery?
