EXAMPLE PROJECT #7

ghost · 2017-01-26T17:46:19Z

Could you make a project from this code that I can run from the terminal on macOS, please? I'm not familiar with Maven.
Thanks a lot in advance.

P.S. If you make this example project, please tell me how to run it. Thanks!

digitalheir · 2017-01-26T21:52:58Z

You don't strictly need Maven; you can just include the jar on the classpath.
https://github.com/digitalheir/java-probabilistic-earley-parser/releases

I only now realized you might not know how to compile Java code. I had assumed that anyone who wants to use this project is a Java developer, so I'm quite interested in your use case. Could you share it?

Anyway, it's a good idea to make the product runnable for anyone who can at least write a context-free grammar. I'll work on that at some point; it shouldn't be much work.

I don't have access to a dev device currently and I'm working full time, so I can probably make a project for you next week.

ghost · 2017-01-26T22:34:13Z

I meant to ask whether your project requires any special settings to run. Anyway, I know how to run Java code :-) I'm a computer engineer. Thanks a lot! I'll wait for your example project next week, if you can manage it. Thanks again! :-)

digitalheir · 2017-02-02T21:17:09Z

I released a new version with usage details, check it out: https://github.com/digitalheir/java-probabilistic-earley-parser/releases

Thanks for reporting this. It forced me to run through a command-line example and fix a bug!

Let me know how it works for you. ~

ghost · 2017-02-03T14:20:06Z

Hi! Your code works, but there is a small problem.
If I type:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy heave

The result is

0.128
└──
    └── S
        ├── NP
        │   ├── Det
        │   │   └── the (the)
        │   └── N
        │       └── heavy (heavy)
        └── VP
            └── V
                └── heave (heave)

In this case the input string belongs to the grammar. But if I type a string containing a word that doesn't belong to the input grammar, like:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy ball

("ball" doesn't belong to the grammar), the result is:

Exception in thread "main" java.lang.NullPointerException
at org.leibnizcenter.cfg.earleyparser.CommandLine.main(CommandLine.java:44)

There is a problem in the CommandLine class.

digitalheir · 2017-02-03T14:28:22Z

Well, what do you expect? It's an illegal sentence, so the code throws an exception. I'll make the output a bit prettier, I guess, as part of the error handling issue #5.

ghost · 2017-02-03T16:11:04Z

The output should be an error such as: symbol "ball" doesn't belong to the input CFG.
We should then recover from this error as explained at the end of issue #5, for example by inserting a word with high probability.
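
A rough sketch of the kind of check that could produce such a message before parsing starts. It does not use the parser's API at all: the class name LexiconCheck, the method unknownTokens, and the hard-coded lexicon are made up for illustration, and in practice the terminals would have to be read from grammar.cfg.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class LexiconCheck {
    /** Returns the tokens that are not in the lexicon, in input order. */
    public static List<String> unknownTokens(Set<String> lexicon, List<String> tokens) {
        List<String> unknown = new ArrayList<>();
        for (String token : tokens) {
            if (!lexicon.contains(token)) {
                unknown.add(token);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Hypothetical lexicon; the real terminals would come from grammar.cfg.
        Set<String> lexicon = Set.of("the", "heavy", "heave");
        List<String> tokens = List.of("the", "heavy", "ball");

        // Report every unknown word instead of failing with a NullPointerException.
        for (String token : unknownTokens(lexicon, tokens)) {
            System.err.println("Symbol \"" + token + "\" does not belong to the input CFG.");
        }
    }
}
```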

digitalheir · 2017-02-08T15:02:07Z

Hi Dan. Additional functionality is coming; I'll try to finish it this evening. I'll add command-line options for different scan modes, one of:

throw an error (strict mode, as it is now, with better logging);
ignore the unfound word (act as if it didn't exist);
replace the unfound word with a wildcard that matches any category.
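
To make the three behaviours concrete, here is a minimal sketch that applies them as a preprocessing step over the token list. This is only an illustration, not how the library does it: the enum ScanMode, the method prepare, and the "<any>" placeholder are assumptions of this example, and a real implementation would more likely handle unknown words inside the parser's scan step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ScanModes {
    /** Hypothetical names for the three behaviours listed above. */
    public enum ScanMode { STRICT, DROP, WILDCARD }

    /** Placeholder for a token that may match any category (an assumption of this sketch). */
    public static final String WILDCARD = "<any>";

    public static List<String> prepare(List<String> tokens, Set<String> lexicon, ScanMode mode) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            if (lexicon.contains(token)) {
                result.add(token);
            } else if (mode == ScanMode.STRICT) {
                throw new IllegalArgumentException("Unknown word: \"" + token + "\"");
            } else if (mode == ScanMode.WILDCARD) {
                result.add(WILDCARD);
            }
            // ScanMode.DROP: the unknown token is simply skipped.
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> lexicon = Set.of("the", "heavy", "heave");
        List<String> tokens = List.of("the", "heavy", "ball");
        System.out.println(prepare(tokens, lexicon, ScanMode.DROP));     // [the, heavy]
        System.out.println(prepare(tokens, lexicon, ScanMode.WILDCARD)); // [the, heavy, <any>]
    }
}
```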

It is difficult to pick a word from the lexicon, because the parser does not know (and should not know) the words that follow, and so cannot tell which words would make a correct sentence.

For the wildcard option, you'll find the most likely category with the Viterbi parse. In post-processing you can select a random word from your lexicon, but I don't think that should be a task for this library, because it seems pretty application-specific to me.
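
A possible shape for that post-processing step, sketched outside the library. It assumes you already have the category the Viterbi parse assigned to the wildcard, plus your own lexicon as a map from category to word probabilities. The class WildcardFill, the method mostLikelyWord, and the numbers are all hypothetical, and this version picks the most probable word rather than a random one so the result is deterministic.

```java
import java.util.Map;
import java.util.Optional;

public class WildcardFill {
    /**
     * Returns the highest-probability word for the given category,
     * or an empty Optional if the lexicon has no words for it.
     */
    public static Optional<String> mostLikelyWord(
            Map<String, Map<String, Double>> lexicon, String category) {
        Map<String, Double> words = lexicon.getOrDefault(category, Map.of());
        return words.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Hypothetical lexicon: category -> (word -> probability).
        Map<String, Map<String, Double>> lexicon = Map.of(
                "N", Map.of("heavy", 0.2, "ball", 0.8),
                "V", Map.of("heave", 1.0));

        // Suppose the Viterbi parse decided the wildcard is most likely an N.
        System.out.println(mostLikelyWord(lexicon, "N").orElse("?")); // ball
    }
}
```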

ghost · 2017-02-08T15:27:24Z

Thanks a lot! You're the best 😃

digitalheir · 2017-02-08T21:11:22Z

I am building the new version right now. 0.9.12 will be available soon.

You can set the parse mode to lenient using either -scanmode drop or -scanmode wildcard.

I'm still thinking about how to communicate error events following an incident like this.

I'm thinking of passing a list of Exceptions to the ParseTree object that the parser ends up with.
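
Roughly what I have in mind, as a standalone sketch. The class ParseOutcome and its methods are placeholder names, not the library's actual ParseTree API, and the tree is just a String here to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ParseOutcome {
    private final String tree;                          // stand-in for the real parse tree
    private final List<Exception> issues = new ArrayList<>();

    public ParseOutcome(String tree) {
        this.tree = tree;
    }

    /** Records a recoverable problem, e.g. a word that was dropped or wildcarded. */
    public void addIssue(Exception issue) {
        issues.add(issue);
    }

    public String getTree() {
        return tree;
    }

    /** Read-only view, so callers can inspect but not modify the recorded issues. */
    public List<Exception> getIssues() {
        return Collections.unmodifiableList(issues);
    }

    public boolean isClean() {
        return issues.isEmpty();
    }

    public static void main(String[] args) {
        ParseOutcome outcome = new ParseOutcome("(S (NP the heavy))");
        outcome.addIssue(new IllegalArgumentException("Unknown word dropped: \"ball\""));
        for (Exception issue : outcome.getIssues()) {
            System.err.println("Warning: " + issue.getMessage());
        }
    }
}
```

The caller still gets a parse in lenient mode, and can decide for itself whether the recorded issues are fatal.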

ghost · 2017-02-08T21:47:52Z

Thanks again!
Yes, that could be a great solution. To resolve an error caused by a wrong word in the input string, you could substitute it with another word (from a production with the same head) that has a higher probability.
Have you implemented the "synchronizing token" method for error recovery?

digitalheir closed this as completed Feb 2, 2017

digitalheir self-assigned this Feb 2, 2017

digitalheir mentioned this issue Feb 11, 2017

ERROR CHECKING IMPLEMENTATION #5

Open
