Commit

Replace Wiki links by local references
Signed-off-by: Stefan Weil <sw@weilnetz.de>
stweil committed Feb 6, 2020
1 parent 5fdb9b6 commit 43bf42c
Showing 27 changed files with 150 additions and 141 deletions.
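
The substitution applied throughout this commit can be sketched as a small script. This is only an illustration, not the tool the author used (the commit does not say how the edits were made). The function name `localize_wiki_links` and the regular expression are hypothetical, and the sketch assumes every wiki page maps one-to-one to a local `.md` file of the same name.

```python
import re

# Assumption: all wiki pages share this URL prefix and each page has a
# same-named local Markdown file.
WIKI_PREFIX = "https://github.com/tesseract-ocr/tesseract/wiki/"

# Matches "](<prefix><page>)" or "](<prefix><page>#<anchor>)" inside a
# Markdown link. The page name may not contain ')', '#' or whitespace.
WIKI_LINK = re.compile(
    r"\]\(" + re.escape(WIKI_PREFIX) + r"([^)#\s]+)(#[^)\s]*)?\)"
)


def localize_wiki_links(markdown):
    """Rewrite wiki URLs in Markdown link targets to local .md references."""

    def repl(match):
        page = match.group(1)
        anchor = match.group(2) or ""  # keep "#section" fragments if present
        return "](" + page + ".md" + anchor + ")"

    return WIKI_LINK.sub(repl, markdown)
```

Bare URLs outside Markdown link syntax and links to the wiki root (`.../wiki#...`) are left untouched by this sketch, so they would still need manual review.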
6 changes: 3 additions & 3 deletions 4.0-with-LSTM.md
@@ -4,7 +4,7 @@ Tesseract 4.0 **+** source code is available in the 'master' branch of the [repo

## Documentation
* [NeuralNetsInTesseract4.00](NeuralNetsInTesseract4.00)
* [VGSLSpecs](https://github.com/tesseract-ocr/tesseract/wiki/VGSLSpecs)
* [VGSLSpecs](VGSLSpecs.md)
* [VGSLSpecs info from Tensorflow](https://github.com/mldbai/tensorflow-models/blob/master/street/g3doc/vgslspecs.md)
* [DAS 2016 tutorial slides](https://github.com/tesseract-ocr/docs/tree/master/das_tutorial2016)
Slides
@@ -13,11 +13,11 @@ Slides
[#7](https://github.com/tesseract-ocr/docs/blob/master/das_tutorial2016/7Building%20a%20Multi-Lingual%20OCR%20Engine.pdf)
have information about LSTM integration in Tesseract 4.0.

* [4.0 Accuracy and Performance](https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance)
* [4.0 Accuracy and Performance](4.0-Accuracy-and-Performance.md)

## Training Tesseract LSTM engine

* [TrainingTesseract 4.00](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00)
* [TrainingTesseract 4.00](TrainingTesseract-4.00.md)

* [tess4training - LSTM Training Tutorial for Tesseract 4](https://github.com/Shreeshrii/tess4training)

2 changes: 1 addition & 1 deletion 4.0x-Changelog.md
@@ -1,4 +1,4 @@
#### [Click here for release notes from version 1.0 in 2006 to current development](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes).
#### [Click here for release notes from version 1.0 in 2006 to current development](ReleaseNotes.md).

#### See below for complete changelog from Jan 2015 to Jul 2019 (4.1 Release)

5 changes: 3 additions & 2 deletions 4.0x-Common-Errors-and-Resolutions.md
@@ -1,6 +1,7 @@
## Errors during LSTM Training

See Ray Smith's notes in [TrainingTesseract 4.00](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00), specifically the section on [errors](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#error-messages-from-training).
See Ray Smith's notes in [TrainingTesseract 4.00](TrainingTesseract-4.00.md),
specifically the section on [errors](TrainingTesseract-4.00.md#error-messages-from-training).

## Errors related to Tesseract API

@@ -20,7 +21,7 @@ right locale settings and fail if that is not the case.
!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 192
```

You have to find out whether "C" works with your code or you must restore
You have to find out whether "C" works with your code or you must restore
the original locale after calling the Tesseract API.


14 changes: 8 additions & 6 deletions APIExample-user_patterns.md
@@ -1,21 +1,23 @@
User patterns can be useful when recognizing ID type of fields which have non-dictionary words but follow specific patterns of alphabets and digits e.g. `\A\A\d\d\d\d\A` or `\A\A\d\d\d\A`

This wiki page provides a simple example on how to use the tesseract-ocr API (4.x) in C++ for applying _user patterns_ for improving recognition. It is expected that tesseract-ocr is correctly installed including all dependencies.
This documentation provides a simple example on how to use the tesseract-ocr API
(4.x) in C++ for applying _user patterns_ for improving recognition.
It is expected that tesseract-ocr is correctly installed including all dependencies.
It is expected the user is familiar with C++, compiling and linking program on their platform.

This is based on [an example provided in tesseract-ocr forum](https://groups.google.com/forum/#!msg/tesseract-ocr/y052O_DwYic/gsJN1NHBfqkJ) and updated for the [recent implementation of the feature for tesseract 4.x](https://github.com/tesseract-ocr/tesseract/pull/2328).

Please note that while this example gets 100% accuracy after user_patterns are applied, that may not always be the case. User patterns (like user dictionaries) are merely applied as a _hint_ while decoding, but not exclusively. [Pre-processing the image](https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality) usually improves the quality of recognition and is recommended.
Please note that while this example gets 100% accuracy after user_patterns are applied, that may not always be the case. User patterns (like user dictionaries) are merely applied as a _hint_ while decoding, but not exclusively. [Pre-processing the image](ImproveQuality.md) usually improves the quality of recognition and is recommended.

## Requirements

In order to apply user patterns for improving recognition, the following are required.

### _user patterns file_

The _user patterns file_ should contain one pattern per line in UTF-8 format. In choosing which patterns to include please be aware of the fact that providing very generic patterns will make tesseract run slower. Best results may be obtained by having a single pattern in the file.
The _user patterns file_ should contain one pattern per line in UTF-8 format. In choosing which patterns to include please be aware of the fact that providing very generic patterns will make tesseract run slower. Best results may be obtained by having a single pattern in the file.

Details of type of patterns that can be used are given in [trie.h](https://github.com/tesseract-ocr/tesseract/blob/master/src/dict/trie.h#L185).
Details of type of patterns that can be used are given in [trie.h](https://github.com/tesseract-ocr/tesseract/blob/master/src/dict/trie.h#L185).

#### Example of a user patterns file

@@ -46,7 +48,7 @@ user_patterns_file path/to/my.patterns
In the following, let's assume you named that config file `path/to/my.patterns.config`.

## CLI Example

From the command line, user patterns can be invoked as follows:

```sh
@@ -68,7 +70,7 @@ The following code uses the above _user patterns file_ and _config file_ on that
int main()
{
Pix *image;
char *outText;
char *outText;
char *configs[]={"path/to/my.patterns.config"};
int configs_size = 1;
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
12 changes: 6 additions & 6 deletions APIExample.md
@@ -1,6 +1,6 @@
# API examples

This wiki provides simple examples on how to use the tesseract-ocr API (v3.02.02-4.0.0) in C++.
This documentation provides simple examples on how to use the tesseract-ocr API (v3.02.02-4.0.0) in C++.
It is expected that tesseract-ocr is correctly installed including all dependencies.
It is expected the user is familiar with C++, compiling and linking program on their platform, [though basic compilation examples are included for beginners with Linux](#compiling-c-api-programs-on-linux).

@@ -184,7 +184,7 @@ int main()
api->Recognize(0);
tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
tesseract::ResultIterator* res_it = api->GetIterator();
// Get confidence level for alternative symbol choices. Code is based on
// Get confidence level for alternative symbol choices. Code is based on
// https://github.com/tesseract-ocr/tesseract/blob/master/src/api/hocrrenderer.cpp#L325-L344
std::vector<std::vector<std::pair<const char*, float>>>* choiceMap = nullptr;
if (res_it != 0) {
@@ -270,7 +270,7 @@ Notice the different confidence values for:
<span class='ocr_glyph' id='choice_1_4_14' title='x_confs 1'>r</span>
<span class='ocr_glyph' id='choice_1_4_15' title='x_confs 1'>e</span>
</span>
```
```


# Compiling C++ API programs on Linux
@@ -348,8 +348,8 @@ int TessBaseAPIInit3(TessBaseAPI* handle, const char* datapath, const char* lang
void TessBaseAPISetImage2(TessBaseAPI* handle, struct Pix* pix);
BOOL TessBaseAPIDetectOrientationScript(TessBaseAPI* handle, char** best_script_name,
int* best_orientation_deg, float* script_confidence,
BOOL TessBaseAPIDetectOrientationScript(TessBaseAPI* handle, char** best_script_name,
int* best_orientation_deg, float* script_confidence,
float* orientation_confidence);
""")

@@ -444,7 +444,7 @@ int main()
bool textonly = false;
int jpg_quality = 92;
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
if (api->Init(datapath, "eng")) {
fprintf(stderr, "Could not initialize tesseract.\n");
exit(1);
2 changes: 1 addition & 1 deletion AddOns.md
@@ -1,4 +1,4 @@
For GUI interface to Tesseract and other 3rd Party projects, please see [User Projects - 3rd Party](https://github.com/tesseract-ocr/tesseract/wiki/User-Projects-%E2%80%93-3rdParty)
For GUI interface to Tesseract and other 3rd Party projects, please see [User Projects - 3rd Party](User-Projects-%E2%80%93-3rdParty.md)

# External tools, wrappers and training projects for Tesseract

4 changes: 1 addition & 3 deletions Command-Line-Usage.md
@@ -10,7 +10,7 @@ See [Running Tesseract](https://github.com/tesseract-ocr/tesseract/wiki#running-

## FAQ

See [FAQ](https://github.com/tesseract-ocr/tesseract/wiki/FAQ#running-tesseract) for more examples and tips.
See [FAQ](FAQ.md#running-tesseract) for more examples and tips.

--------------------------------------------

@@ -260,5 +260,3 @@ Output
517

531

2 changes: 1 addition & 1 deletion Compiling-–-GitInstallation.md
@@ -273,4 +273,4 @@ Example (Run the fuzzer to find new bugs):
nice bin/fuzzer/fuzzer-api -jobs=16 -workers=16

## Building using Windows Visual Studio
See [Compiling for Windows](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows).
See [Compiling for Windows](Compiling.md#windows).
20 changes: 10 additions & 10 deletions Compiling.md
@@ -1,16 +1,16 @@
# Compilation guide for various platforms

**Note:** This wiki expects you to be familiar with compiling software on your operation system.
**Note:** This documentation expects you to be familiar with compiling software on your operation system.

*Use the same tools for building tesseract as you used for [building leptonica](https://github.com/DanBloomberg/leptonica/issues/410).*

## Table of contents
* [Linux](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#linux)
* [Windows](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows)
* [macOS](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#macos)
* [Android](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#android)
* [Common Errors](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#common-errors)
* [Miscellaneous](https://github.com/tesseract-ocr/tesseract/wiki/Compiling#miscellaneous)
* [Linux](Compiling.md#linux)
* [Windows](Compiling.md#windows)
* [macOS](Compiling.md#macos)
* [Android](Compiling.md#android)
* [Common Errors](Compiling.md#common-errors)
* [Miscellaneous](Compiling.md#miscellaneous)

## Linux
To install **Tesseract 4.x** you can simply run the following command on your **Ubuntu 18.xx bionic**:
@@ -81,7 +81,7 @@ Note that if building Leptonica from source, you may need to ensure that /usr/lo

## Installing Tesseract from Git

Please follow instructions in [https://github.com/tesseract-ocr/tesseract/wiki/Compiling--GitInstallation](https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation)
Please follow instructions in [Compiling--GitInstallation](Compiling-%E2%80%93-GitInstallation.md)

Also read [Install Instructions](https://github.com/tesseract-ocr/tesseract/blob/master/INSTALL.GIT.md)

@@ -122,7 +122,7 @@ export PKG_CONFIG_PATH=$HOME/local/lib/pkgconfig

## Language Data

* Download the [data file(s) for the language(s) you are interested in](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files).
* Download the [data file(s) for the language(s) you are interested in](Data-Files.md).
* Move it to the `tessdata` directory (e.g. 'mv tessdata $TESSDATA\_PREFIX' if defined `TESSDATA_PREFIX`)

You can also use:
@@ -138,7 +138,7 @@ to point to your tessdata directory (example: if your tessdata path is '/usr/loc

#### Using Tesseract

**!!! IMPORTANT !!!** To use Tesseract in your application (to include tess or to link it into your app) see this very simple example https://github.com/tesseract-ocr/tesseract/wiki/User-App-Example.
**!!! IMPORTANT !!!** To use Tesseract in your application (to include tess or to link it into your app) see this very simple [example](User-App-Example.md).

#### Build the latest library (using SW)

2 changes: 1 addition & 1 deletion Data-Files-in-tessdata_fast.md
@@ -11,7 +11,7 @@ When using the traineddata files from the **`tessdata_best`** and **`tessdata_fa
Community contributed traineddata files can be found at:

* [tessdata_contrib](https://github.com/tesseract-ocr/tessdata_contrib) repo
* [Wiki page with links to externals repos](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files-Contributions)
* [Wiki page with links to externals repos](Data-Files-Contributions.md)

## Information specific to tessdata_fast

8 changes: 4 additions & 4 deletions Documentation.md
@@ -1,10 +1,10 @@
## Technical Documentation

[Technical Papers and Presentations](https://github.com/tesseract-ocr/tesseract/wiki/Technical-Documentation)
[Technical Papers and Presentations](Technical-Documentation.md)

## Tesseract 4.0 with LSTM

For information about the new LSTM based tesseract engine, please see [these wiki pages](https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM).
For information about the new LSTM based tesseract engine, please see the [documentation](4.0-with-LSTM.md).

## Manual Pages (3.0x)

@@ -24,7 +24,7 @@ plus description of [unicharambigs](https://github.com/tesseract-ocr/tesseract/b

## Changes to Tesseract

* [Release Notes](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes)
* [Release Notes](ReleaseNotes.md)
* [Change Log](https://github.com/tesseract-ocr/tesseract/blob/master/ChangeLog)

## API/ABI Changes Review
@@ -46,4 +46,4 @@ Documentation of tesseract generated from source code as of July 2015 by [doxyge

## FAQ

[Frequently Asked Questions](https://github.com/tesseract-ocr/tesseract/wiki/FAQ)
[Frequently Asked Questions](FAQ.md)
4 changes: 2 additions & 2 deletions Downloads.md
@@ -10,12 +10,12 @@ Tesseract is included in most Linux distributions.

## Binaries for Windows

https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM#400-alpha-for-windows
https://tesseract-ocr.github.io/tessdoc/4.0-with-LSTM.html#400-alpha-for-windows

### Old Downloads

[Downloads Archive on SourceForge](http://sourceforge.net/projects/tesseract-ocr-alt/files/).
There you can find, among other files, Windows installer for the **old** version 3.02.
There you can find, among other files, Windows installer for the **old** version 3.02.

Currently, there is no **official** Windows installer for newer versions.


0 comments on commit 43bf42c
