Prepare for release

ropensci · Jan 26, 2018 · 3e272f1
1 parent 9d57df2
Showing 4 changed files with 20 additions and 9 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: tesseract
 Type: Package
-Title: Open Source OCR Engine for R
-Version: 1.7.9000
+Title: Open Source OCR Engine
+Version: 1.8
 Author: Jeroen Ooms
 Maintainer: Jeroen Ooms <[email protected]>
 Description: Bindings to 'Tesseract': An OCR engine with unicode (UTF-8) support

diff --git a/R/tesseract.R b/R/tesseract.R
@@ -4,7 +4,7 @@
 #' are reading. Works best for images with high contrast, little noise and horizontal text.
 #'
 #' Tesseract uses training data to perform OCR. Most systems default to English
-#' training data. To improve OCR performance for other langauges you can to install the
+#' training data. To improve OCR performance for other languages you can install the
 #' training data from your distribution. For example to install the spanish training data:
 #'
 #'  - [tesseract-ocr-spa](https://packages.debian.org/testing/tesseract-ocr-spa) (Debian, Ubuntu)

diff --git a/README.md b/README.md
@@ -81,12 +81,23 @@ brew install tesseract
 ```
 
 Tesseract uses training data to perform OCR. Most systems default to English
-training data. To improve OCR performance for other langauges you can to install the
-training data from your distribution. For example to install the spanish training data:
+training data. To improve OCR results for other languages you can install the
+appropriate training data. On Windows and OSX you can do this in R using
+`tesseract_download()`:
+
+
+```r
+tesseract_download('fra')
+```
+
+On Linux you need to install the appropriate training data from your distribution.
+For example, to install the Spanish training data:
 
   - [tesseract-ocr-spa](https://packages.debian.org/testing/tesseract-ocr-spa) (Debian, Ubuntu)
   - [tesseract-langpack-spa](https://apps.fedoraproject.org/packages/tesseract-langpack-spa) (Fedora, EPEL)
 
-On other platforms you can manually download training data from [github](https://github.com/tesseract-ocr/tessdata)
-and store it in a path on disk that you pass in the `datapath` parameter. Alternatively
-you can set a default path via the `TESSDATA_PREFIX` environment variable.
+Alternatively you can manually download training data from [GitHub](https://github.com/tesseract-ocr/tessdata)
+and store it in a path on disk that you pass in the `datapath` parameter, or set a default path via the
+`TESSDATA_PREFIX` environment variable. Note that Tesseract 4 and Tesseract 3 use different
+training data formats. Make sure to download training data from the branch that matches your libtesseract version.
+
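
Note on the README hunk above: the manual-download route could be sketched in R roughly as follows. This is a minimal sketch, not part of the commit. The helper name `ocr_with_local_data`, its arguments, and the example paths are hypothetical, and it assumes the downloaded files follow the usual `<lang>.traineddata` naming and sit directly in the directory passed as `datapath`.

```r
# Sketch: OCR with manually downloaded training data.
# `ocr_with_local_data` is a hypothetical helper, not part of the package.
ocr_with_local_data <- function(image, lang, tessdata_dir) {
  tessdata_dir <- path.expand(tessdata_dir)
  # Assumption: training data files are named <lang>.traineddata
  trained <- file.path(tessdata_dir, paste0(lang, ".traineddata"))
  if (!file.exists(trained)) {
    stop("No training data found at ", trained)
  }
  # Pass the directory explicitly through the `datapath` parameter
  engine <- tesseract::tesseract(language = lang, datapath = tessdata_dir)
  tesseract::ocr(image, engine = engine)
}

# Example call (file and directory names are placeholders):
# text <- ocr_with_local_data("scan.png", "spa", "~/tessdata")
# cat(text)
```

Failing early when the file is missing gives a clearer message than an engine initialisation error. It does not check that the file format matches the installed libtesseract version, so that still has to be verified by hand.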
diff --git a/man/tesseract.Rd b/man/tesseract.Rd