2.0.0-SNAPSHOT Update #12
Merged
…head. Lower-resolution fonts will now be less accurate and are not recommended, in exchange for what is most likely a significant speed boost.
…e database for reading them
… things like apostrophe data, starting to make the OCR work again
…testing with natural images still needs to be done
…some Javadocs updates and API improvements, and some light optimizations.
… OCR results across all characters, though more prevalent on special characters
… it though. The main breaking point currently is punctuation; all alphanumeric characters are recognized with 100% accuracy on the training set. A code cleanup will be done in the future; I just need to come back to a working state tomorrow.
A code cleanup is very much needed, and the code is not ready for production. Dot detection and horizontal separation may need to be implemented in the future for more accurate overall tracking.
…d heavily clean up code
…es with periods and apostrophes
…s for some reason
…he accuracy is 99.74%)
Currently set to only one character mismatch, but this will be fine-tuned soon for everything.
…cess fully automatic and fix 4 problematic characters
…the error correction
… natural images or handwriting is planned
…ion. Also added SLF4J
…ngs to training options and added more documentation
…most fonts. Check the TODO for a more in-depth explanation of the plans.
The current method may need to be reworked though, to first detect the characters and then piece them together, similar to apostrophes.
… pieced together in a more OOP-manner
Still need to clean up code, and add tests
…rate than before. There are a few oddities that I need to fix, and after adding tests and maybe adding to the API a bit, it should be ready for the v2 release.
Now can proceed to adding more tests and improving API
This is a separate commit because there was a LOT of stuff to remove and it's easy to accidentally delete something (I had to dig for some stuff accidentally deleted while testing this commit)
Worked fine with Monospaced but not with Comic Sans (referred to as CMS in the code, for anyone wondering), then I "fixed" some stuff and it switched. It now passes tests, but both fonts have some things to touch up. Also adds new tests and the ability to easily add more. Still riddled with debug messages, but that will be fixed (hopefully) soon.
Forgot what font I was using as a demo so I'm using Monospaced again
…es and LetterMeta as they were unused
…uning of things like calculated spacing and look-alike characters. Docs and examples coming soon.
… for more dynamic similarities based off of configurations instead of hardcoded classes
…e config. The only issues I've run into while testing are _ being detected as - in Monospaced, and two i's being detected as | in Comic Sans.
I decided not to merge the classes: the two abstract classes that all characters extend clean up their code heavily and still allow for the necessary data separation. This commit also cleans up the general code a bunch, including making the tests require a 98% or higher success rate. Once testing is done and I have had some people review it, it will be release-ready.
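As a rough illustration of what a 98% success-rate requirement could look like in a test, here is a minimal sketch. The class name `AccuracyCheck`, its methods, and the position-by-position comparison are all assumptions made for this example and are not taken from the project's code; a real check would likely need an alignment-aware comparison, since a single inserted or dropped character shifts every later position.

```java
// Hypothetical helper, not part of the project's API.
final class AccuracyCheck {
    // Threshold taken from the commit message: tests require 98% or higher.
    static final double REQUIRED_ACCURACY = 0.98;

    // Fraction of positions where the OCR output matches the expected text.
    // Dividing by the longer length penalizes missing or extra characters.
    static double accuracy(String expected, String actual) {
        int longest = Math.max(expected.length(), actual.length());
        if (longest == 0) {
            return 1.0; // two empty strings match trivially
        }
        int shared = Math.min(expected.length(), actual.length());
        int matches = 0;
        for (int i = 0; i < shared; i++) {
            if (expected.charAt(i) == actual.charAt(i)) {
                matches++;
            }
        }
        return (double) matches / longest;
    }

    static boolean passes(String expected, String actual) {
        return accuracy(expected, actual) >= REQUIRED_ACCURACY;
    }

    public static void main(String[] args) {
        String expected = "snake_case";
        String scanned = "snake-case"; // '_' read as '-', as reported for Monospaced
        System.out.println("accuracy = " + accuracy(expected, scanned));
        System.out.println("passes   = " + passes(expected, scanned));
    }
}
```

With a short sample like the one in `main`, a single misread character is enough to fall below the threshold, so a check of this kind is only meaningful over a reasonably long test string.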
A little bit of code cleanup is needed
I'm not really sure if this is how Travis works with Java projects, so this may require multiple commits to finish.
This was 69 commits squashed into this single one :(
…e trained check, and more
Changed basically everything about how the OCR works, including proper character detection, support for other fonts, a very clean API, Travis support, and more. This will be promoted to a release once MS Paint IDE is complete, which should surface any outstanding bugs that may still be present.