Commit
Merge pull request #12 from MSPaintIDE/dev
2.0.0-SNAPSHOT Update
Showing 123 changed files with 5,132 additions and 2,634 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Gradle files | ||
.gradle/ | ||
gradle/ | ||
gradlew | ||
gradlew.bat | ||
|
||
# IDE Generated Files | ||
.idea/ | ||
build/ | ||
out/ | ||
**.ppf | ||
|
||
# Training temp files | ||
**/resources/training_**.png | ||
database/ | ||
|
||
# Misc. | ||
**.class | ||
training.png | ||
training_**.png |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
os: windows | ||
language: shell | ||
filter_secrets: false | ||
cache: false | ||
before_install: | ||
- choco install jdk11 -params 'installdir=c:\\newocr\\jdk' -y | ||
- wget http://services.gradle.org/distributions/gradle-5.3-bin.zip | ||
- unzip -qq gradle-5.3-bin.zip -d /c/newocr/gradle | ||
- export GRADLE_HOME=/c/newocr/gradle/gradle-5.3 | ||
- export JAVA_HOME=/c/newocr/jdk | ||
- export PATH=$GRADLE_HOME/bin:$PATH | ||
- export PATH=$JAVA_HOME/bin:$PATH | ||
- set TERM=dumb | ||
- gradle -version | ||
script: | ||
- gradle clean install cleanTest test --exclude-task signArchives --no-daemon | ||
after_success: | ||
- wget https://raw.githubusercontent.com/DiscordHooks/travis-ci-discord-webhook/master/send.sh | ||
- bash send.sh success $WEBHOOK_URL | ||
after_failure: | ||
- wget https://raw.githubusercontent.com/DiscordHooks/travis-ci-discord-webhook/master/send.sh | ||
- bash send.sh failure $WEBHOOK_URL | ||
deploy: | ||
provider: script | ||
script: bash scripts/deploy.sh | ||
on: | ||
branch: master |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Fonts | ||
|
||
NewOCR can be trained on almost any font from a training file in the correct format (some may require configuration file changes), though some fonts are already configured and trained to work. Fonts can be added to or removed from this list as long as they work and pass all tests. Any OCR changes are tested against these fonts, so the more fonts there are, the fewer problems the OCR will have in the end. | ||
|
||
## Why is [font] not supported? | ||
|
||
There are several reasons a font may not be supported. In the current NewOCR version, fonts must contain all characters in the training image and have **no kerning.** Kerning is the biggest reason some fonts are not supported. A smaller reason is that some letters look too similar to each other. An example is the font Arial and the characters I, L, and |. They all look identical other than a height change in one of them, which makes it impossible for the OCR to tell them apart without context (Soon to be supported, hopefully). | ||
|
||
## Supported Fonts | ||
|
||
Just because a font is not on this list does **not** mean it will not work! These are just the fonts that the OCR is tested against. If you have a font that works, make a PR that adds its config to the repo and adds it here! | ||
|
||
+ Comic Sans MS | ||
+ Monospaced | ||
+ Verdana | ||
+ Calibri | ||
+ Consolas | ||
+ Courier New | ||
## Unsupported Fonts | ||
|
||
+ Arial **Reason: Kerning/Similar characters** | ||
+ Terminal **Reason: Kerning** | ||
+ Lucida Console **Reason: Kerning** (Need to double-check) | ||
+ Javanese Text | ||
+ Ebrima | ||
+ Montserrat **Reason: Kerning** (Around [\\]) | ||
+ OCR-A **Reason: Conjoined quotes** <small>hmmm... ironic</small> | ||
+ Myanmar Text **Reason: Kerning** | ||
+ Bahnschrift Light Condensed **Reason: vertical lines misrecognition** | ||
+ Ink Free **Reason: Kerning** |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,91 +1,30 @@ | ||
<a href="https://discord.gg/RXmPkPJ"> | ||
<img src="https://img.shields.io/discord/528423806453415972.svg?logo=discord" | ||
alt="NewOCR and MS Paint IDE's Discord server"> | ||
</a> | ||
<div> | ||
<a href="https://search.maven.org/artifact/com.uddernetworks.newocr/NewOCR/"> | ||
<img alt="Maven Central" src="https://maven-badges.herokuapp.com/maven-central/com.uddernetworks.newocr/NewOCR/badge.svg"> | ||
</a> | ||
<a href="https://discord.gg/RXmPkPJ"> | ||
<img src="https://img.shields.io/discord/528423806453415972.svg?logo=discord" | ||
alt="NewOCR and MS Paint IDE's Discord server"> | ||
</a> | ||
<a href="https://travis-ci.org/MSPaintIDE/NewOCR/"> | ||
<img alt="Travis (.org) branch" src="https://img.shields.io/travis/RubbaBoy/NewOCR/dev.svg"> | ||
</a> | ||
</div> | ||
|
||
# NewOCR | ||
NewOCR is an OCR library made to suit [MS Paint IDE](https://github.com/RubbaBoy/MSPaintIDE)'s needs, though it can be used in any project, as nothing is made specific to the IDE. The OCR can be trained with many fonts, though it is geared towards fonts like **Verdana** and similar fonts. Other fonts _may_ require some tweaking of the character detector, but the main detection will work no matter how different the characters are from Verdana (Hell you could modify it to work with emojis). | ||
NewOCR is an OCR library made to suit [MS Paint IDE](https://github.com/MSPaintIDE/MSPaintIDE)'s needs, though it can be used in any project, as nothing is made specific to the IDE. The OCR can be trained with many fonts, though it is geared towards fonts like **Verdana** and similar fonts. Other fonts _may_ require some tweaking of the character detector, but the main detection will work no matter how different the characters are from Verdana (You could even modify it to work with emojis). | ||
|
||
## How it works | ||
### Summary | ||
NewOCR uses a super sketchy method of detecting characters, which in short breaks up each character into different subsections, then gets the percentage of filled in pixels each section contains, and puts them into an array. It then finds the closest matching array in the trained data, and the character that array belongs to is taken as the match. | ||
Currently, NewOCR is being tested against the following fonts: | ||
|
||
### Sectioning | ||
Each letter is broken up into 16 sections. These aren't pixel-based, but percentage based. This allows them to be created on all sized letters with the same proportions. | ||
- Comic Sans MS | ||
- Monospaced | ||
- Verdana | ||
- Calibri | ||
- Consolas | ||
- Courier New | ||
|
||
First, the letter is horizontally broken up into top and bottom sections. Then, each of those two sections is broken up vertically into another two sections. The remaining sections are broken up into diagonal sections, with their diagonals angling towards the center of the character. A visual of what the sections look like and their index in the value array (Will be used later) can be found here: | ||
![Section examples 1](/images/E1.png) | ||
Though you can train the OCR on many, many other fonts. For more information on fonts used and how they are chosen, see the [fonts page](Fonts.md). | ||
|
||
After that process has occurred, the second sectioning process starts. This one is more simple, in that it first horizontally separates it into thirds, then those sections into vertical thirds. The sections and their indices look like the following: | ||
![Section examples 2](/images/E2.png) | ||
To get started with using NewOCR or get a detailed description of how every piece of the OCR works from start to finish, you can visit the wiki here: [https://wiki.newocr.dev/](https://wiki.newocr.dev/) | ||
|
||
### Applying the sections | ||
|
||
After the sections and their indices have been established, the system gets the percentage of pixels in each section that are black (Rather than white, as it's effectively a binary image). Applied to our sections, this is what the values for sections of the letter **E** would look like (Depending on the size, these values may vary from your results): | ||
![Section values 1](/images/Eval1.png) | ||
![Section values 2](/images/Eval2.png) | ||
|
||
With the indices applied, the value array would be: | ||
``` | ||
[0.86, 0.51, 0.46, 0.48, 0.46, 0.67, 0.43, 0.09, 0.77, 0.37, 0.37, 0.77, 0.36, 0.36, 0.77, 0.37, 0.37] | ||
``` | ||
|
||
These values are then compared to the averaged out trained characters' data, and the closest match is given. Other things that affect its similarity to the trained database character are the width/height ratio, which helps distinguish characters like `_` and `-`. Some type meta can also be attached to the database character, but still has the percentage values stored. These meta values are things like if it had to append chunks of pixels together in such a way it has to be a percentage sign, if it appended pixels to the top of a base character (`!`, `i`, `j`), to a bottom of a character (`!`), and some others. The enum containing these values may be found here: [LetterMeta.java](/src/main/java/com/uddernetworks/newocr/LetterMeta.java). | ||
|
||
### Training | ||
A vital part of the OCR is its training. Though many OCRs require training for their Neural Networks, NewOCR uses a simple, fast method of training involving essentially averaging values from characters. | ||
|
||
The OCR starts off with a generated image of all the characters it can take advantage of through the [TrainGenerator](/src/main/java/com/uddernetworks/newocr/TrainGenerator.java) class, taking up fonts from an upper to lower bound. The system gets the character bounds for every character, then incrementally goes through the characters, putting the segmented percentages described above into a database, after averaging all the font sizes together. This is also done with the width and height of the character, for increased accuracy. The accuracy of the character segmentation is crucial in this step: if one character is detected as, say, two, it will throw off the entire line, resulting in a useless training data set. | ||
|
||
When fonts are scaled to smaller sizes where they get deformed by pixelation, their percentages may be significantly different from the higher resolution variants. To circumvent this, the database is broken up into different sections of font bounds, e.g. values from font size 0-12 will be placed together, 13-20, and 20+ will be grouped together. The bounds' values and count may be changed in the program. | ||
|
||
Example of a training image: | ||
![Training image](/images/training.png) | ||
|
||
## Using It | ||
NewOCR is on Central, so it's insanely easy to get on both Maven and Gradle. | ||
|
||
Gradle: | ||
```Groovy | ||
compile 'com.uddernetworks.newocr:NewOCR:1.2.1' | ||
``` | ||
|
||
Maven: | ||
```XML | ||
<dependency> | ||
<groupId>com.uddernetworks.newocr</groupId> | ||
<artifactId>NewOCR</artifactId> | ||
<version>1.2.1</version> | ||
</dependency> | ||
``` | ||
|
||
### Creating the training image | ||
The OCR needs an image to base all its font data off of, so a training image is required. The class `TrainGenerator.java` has the ability to create such images, and you can just change `UPPER_FONT_BOUND` and `LOWER_FONT_BOUND` to the maximum and minimum fonts to be created in the image. After running the program, you should have an image similar to the one displayed above in [Training](#Training). | ||
|
||
Currently the font `Verdana` is the only font tested to work with the character recogniser, though if character detection was modified/improved (Planned for the future) it could easily detect many more fonts with high accuracy. | ||
|
||
### Setting up the database | ||
To use NewOCR, a MySQL database is required. This is to store all the section data of each character. To run by the example usage in `Main.java`, you will need to put the database's URL, username, and password as the program arguments in their respective orders. An example of this would be: | ||
```java -jar NewOCR-1.2.1.jar "jdbc:mysql://127.0.0.1:3306/OCR" "my_user" "my_pass"``` | ||
You will _not_ be required to run any queries manually once you have created a table for the OCR; the program will do that for you. | ||
|
||
Before you do anything with detecting characters you must train the OCR. It does not use any Neural Networks as shown in the explanation above, but it needs to register how the font works. In order to get this working in `Main.java`, make sure that in the main method you have `new Main().run(args)` uncommented, and that further down the file `new File("training.png")` and `new File("HWTest.png")` point to valid paths, the first one being the training image as described above, and the second your input image. When you run the program, type `yes` when it asks if you want to train, and then wait a minute or so. When the program exits, you should be able to run it again, answer `no` to that question, and after a few seconds it should give its output. | ||
|
||
### System properties used | ||
NewOCR uses a few system properties for some extra options for debugging and other things. Here is a list of them (More may be added in the future): | ||
- **newocr.rewrite** [Boolean] - Rewrites the image to a new BufferedImage before it's scanned. This could fix some weird encoding issues happening in the past | ||
- **newocr.error** [Boolean] - If the system should output certain problems it thinks may have occurred (NOT stacktraces, those are always shown) | ||
- **newocr.debug** [Boolean] - If the system should display some certain debug messages used in the program | ||
|
||
## Resources | ||
The following papers were used as inspiration, ideas, knowledge gathering, whatever it may be towards the advancement of this OCR. I could have forgotten a few research papers, I read a lot of them. They might just be stuff I thought was really cool related to the subject, I'm generalizing this description to hell so I won't have to change it later. | ||
|
||
- https://www.researchgate.net/publication/260405352_OPTICAL_CHARACTER_RECOGNITION_OCR_SYSTEM_FOR_MULTIFONT_ENGLISH_TEXTS_USING_DCT_WAVELET_TRANSFORM | ||
- https://core.ac.uk/download/pdf/20643247.pdf | ||
- https://www.researchgate.net/publication/321761298_Generalized_Haar-like_filters_for_document_analysis_application_to_word_spotting_and_text_extraction_from_comics | ||
- https://pdfs.semanticscholar.org/c8b7/804abc030ee93eff2f5baa306b8b95361c57.pdf | ||
- http://www.frc.ri.cmu.edu/~akeipour/downloads/Conferences/ICIT13.pdf | ||
- https://support.dce.felk.cvut.cz/mediawiki/images/2/24/Bp_2017_troller_milan.pdf | ||
- http://www.cs.toronto.edu/~scottl/research/msc_thesis.pdf | ||
- https://www.researchgate.net/publication/258651794_Novel_Approach_for_Baseline_Detection_and_Text_Line_Segmentation | ||
- https://www.researchgate.net/publication/2954700_Neural_and_fuzzy_methods_in_handwriting_recognition | ||
To view javadocs on the project, you can go here: [https://docs.newocr.dev/](https://docs.newocr.dev/) |
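
The "How it works" text removed from the README in this commit describes splitting a character into proportional sections, recording the fraction of black pixels per section, and picking the trained character whose value array is closest. A minimal sketch of that idea follows, covering only the 3x3 grid (the second sectioning process). It is not NewOCR's actual code: the class name `SectionMatchSketch`, the method names, and the use of squared Euclidean distance as the "closest" measure are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch, not part of NewOCR: names and distance metric are assumptions.
class SectionMatchSketch {

    /** Returns the fraction of black pixels in each cell of a 3x3 grid, row-major (9 values). */
    static double[] gridFillRatios(boolean[][] black) {
        int height = black.length;
        int width = black[0].length;
        double[] ratios = new double[9];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                // Bounds are proportional to the glyph size, not fixed pixel counts.
                int y0 = row * height / 3, y1 = (row + 1) * height / 3;
                int x0 = col * width / 3, x1 = (col + 1) * width / 3;
                int filled = 0;
                int total = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) {
                        total++;
                        if (black[y][x]) {
                            filled++;
                        }
                    }
                }
                ratios[row * 3 + col] = total == 0 ? 0.0 : (double) filled / total;
            }
        }
        return ratios;
    }

    /** Sum of squared differences between two value arrays of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the trained character whose value array is nearest to the given values. */
    static char closestMatch(double[] values, Map<Character, double[]> trained) {
        if (trained.isEmpty()) {
            throw new IllegalArgumentException("no trained characters");
        }
        char best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<Character, double[]> entry : trained.entrySet()) {
            double distance = squaredDistance(values, entry.getValue());
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // 6x6 binary glyph: a vertical bar filling the left third of the image.
        boolean[][] bar = new boolean[6][6];
        for (int y = 0; y < 6; y++) {
            bar[y][0] = true;
            bar[y][1] = true;
        }
        // Made-up "trained" averages for two characters.
        Map<Character, double[]> trained = Map.of(
                '|', new double[] {1, 0, 0, 1, 0, 0, 1, 0, 0},
                '_', new double[] {0, 0, 0, 0, 0, 0, 1, 1, 1});
        System.out.println("Closest match: " + closestMatch(gridFillRatios(bar), trained));
    }
}
```

The removed README also mentions that the width/height ratio and some per-character meta values feed into the comparison; those are left out here to keep the sketch short.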
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,7 +23,7 @@ apply plugin: 'io.codearte.nexus-staging' | |
|
||
group 'com.uddernetworks.newocr' | ||
archivesBaseName = "NewOCR" | ||
version '1.2.1' | ||
version '2.0.0-SNAPSHOT' | ||
|
||
sourceCompatibility = 11 | ||
|
||
|
@@ -32,8 +32,13 @@ repositories { | |
} | ||
|
||
dependencies { | ||
testCompile group: 'junit', name: 'junit', version: '4.12' | ||
testImplementation('org.junit.jupiter:junit-jupiter:5.4.2') | ||
testCompile group: 'org.apache.commons', name: 'commons-math3', version: '3.6.1' | ||
testCompile group: 'org.slf4j', name: 'slf4j-log4j12', version: '1.7.25' | ||
|
||
compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.25' | ||
|
||
compile group: 'org.bitbucket.cowwoc', name: 'diff-match-patch', version: '1.2' | ||
|
||
compile group: 'com.zaxxer', name: 'HikariCP', version: '2.7.8' | ||
compile group: 'mysql', name: 'mysql-connector-java', version: '5.1.6' | ||
|
@@ -43,12 +48,26 @@ dependencies { | |
|
||
// https://mvnrepository.com/artifact/it.unimi.dsi/fastutil | ||
compile group: 'it.unimi.dsi', name: 'fastutil', version: '8.2.2' | ||
|
||
compile group: 'com.typesafe', name: 'config', version: '1.3.3' | ||
} | ||
|
||
ext.moduleName = 'NewOCR' | ||
|
||
tasks.withType(Test) { | ||
maxParallelForks = 1 | ||
} | ||
|
||
test { | ||
minHeapSize = "1024m" | ||
maxHeapSize = "61446m" | ||
forkEvery = 1 | ||
} | ||
|
||
javadoc { | ||
options.addStringOption('-module-path', classpath.asPath) | ||
source = sourceSets.main.allJava | ||
classpath = configurations.compile | ||
} | ||
|
||
compileJava { | ||
|
@@ -61,6 +80,13 @@ compileJava { | |
} | ||
} | ||
|
||
test { | ||
useJUnitPlatform() | ||
testLogging { | ||
events "passed", "skipped", "failed" | ||
} | ||
} | ||
|
||
nexusStaging { | ||
if (project.hasProperty("ossrhUser") && project.hasProperty("ossrhPassword")) { | ||
username = ossrhUser | ||
|
@@ -86,59 +112,62 @@ artifacts { | |
archives javadocJar, sourcesJar | ||
} | ||
|
||
allprojects { | ||
apply plugin: 'signing' | ||
apply plugin: 'maven' | ||
apply plugin: 'signing' | ||
apply plugin: 'maven' | ||
|
||
// Signature of artifacts | ||
signing { | ||
sign configurations.archives | ||
} | ||
// Signature of artifacts | ||
signing { | ||
sign configurations.archives | ||
} | ||
|
||
// OSSRH publication | ||
uploadArchives { | ||
repositories { | ||
mavenDeployer { | ||
// POM signature | ||
beforeDeployment { MavenDeployment deployment -> signing.signPom(deployment) } | ||
|
||
if (project.hasProperty("ossrhUser") && project.hasProperty("ossrhPassword")) { | ||
// Target repository | ||
repository(url: "https://oss.sonatype.org/service/local/staging/deploy/maven2/") { | ||
authentication(userName: ossrhUser, password: ossrhPassword) | ||
} | ||
// -Prelease uploadArchives closeAndPromoteRepository -x demo:uploadArchives | ||
|
||
// OSSRH publication | ||
uploadArchives { | ||
repositories { | ||
mavenDeployer { | ||
// POM signature | ||
beforeDeployment { MavenDeployment deployment -> signing.signPom(deployment) } | ||
|
||
if (project.hasProperty("ossrhUser") && project.hasProperty("ossrhPassword")) { | ||
// Target repository | ||
String repo = version.toString().endsWith("-SNAPSHOT") ? | ||
"https://oss.sonatype.org/content/repositories/snapshots" : | ||
"https://oss.sonatype.org/service/local/staging/deploy/maven2" | ||
println "Using repository ${repo}" | ||
repository(url: repo) { | ||
authentication(userName: ossrhUser, password: ossrhPassword) | ||
} | ||
} | ||
|
||
pom.project { | ||
name 'NewOCR' | ||
description 'NewOCR is a library for simple but efficient OCR detection in pure Java.' | ||
packaging 'jar' | ||
url 'https://github.com/RubbaBoy/NewOCR' | ||
pom.project { | ||
name 'NewOCR' | ||
description 'NewOCR is a library for simple but efficient OCR detection in pure Java.' | ||
packaging 'jar' | ||
url 'https://github.com/MSPaintIDE/NewOCR' | ||
|
||
scm { | ||
connection 'scm:git:https://github.com/RubbaBoy/NewOCR.git' | ||
developerConnection 'scm:git:git@github.com:RubbaBoy/NewOCR.git' | ||
url 'https://github.com/RubbaBoy/NewOCR.git' | ||
} | ||
scm { | ||
connection 'scm:git:https://github.com/MSPaintIDE/NewOCR.git' | ||
developerConnection 'scm:git:git@github.com:MSPaintIDE/NewOCR.git' | ||
url 'https://github.com/MSPaintIDE/NewOCR.git' | ||
} | ||
|
||
licenses { | ||
license { | ||
name 'The MIT License (MIT)' | ||
url 'http://opensource.org/licenses/MIT' | ||
distribution 'repo' | ||
} | ||
licenses { | ||
license { | ||
name 'The MIT License (MIT)' | ||
url 'http://opensource.org/licenses/MIT' | ||
distribution 'repo' | ||
} | ||
} | ||
|
||
developers { | ||
developer { | ||
id = 'RubbaBoy' | ||
name = 'Adam Yarris' | ||
email = '[email protected]' | ||
} | ||
developers { | ||
developer { | ||
id = 'RubbaBoy' | ||
name = 'Adam Yarris' | ||
email = '[email protected]' | ||
} | ||
} | ||
} | ||
} | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
apply plugin: 'java' | ||
|
||
repositories { | ||
mavenLocal() | ||
mavenCentral() | ||
} | ||
|
||
dependencies { | ||
compile project(':') | ||
|
||
compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.25' | ||
compile group: 'org.slf4j', name: 'slf4j-log4j12', version: '1.7.25' | ||
} |