Languages supported in different versions of Tesseract

This table does not include the script traineddata files that were added later.

Lang Code Language 3.00 3.02 3.04 4.0.0
afr Afrikaans x x x
amh Amharic x x
ara Arabic x x x
asm Assamese x x
aze Azerbaijani x x
aze_cyrl Azerbaijani - Cyrillic x x x
bel Belarusian x x x
ben Bengali x x x
bod Tibetan x x
bos Bosnian x x
bre Breton x
bul Bulgarian x x x
cat Catalan; Valencian x x x
ceb Cebuano x x
ces Czech x x x
chi_sim Chinese - Simplified x x x
chi_tra Chinese - Traditional x x x
chr Cherokee x x x
cym Welsh x x
dan Danish x x x
dan_frak Danish - Fraktur (contrib) x x
deu German x x x
deu_frak German - Fraktur (contrib) x x
dzo Dzongkha x x
ell Greek, Modern (1453-) x x x
eng English x x x x
enm English, Middle (1100-1500) x x x
epo Esperanto x x x
equ Math / equation detection module x x
est Estonian x x x
eus Basque x x x
fas Persian x x
fin Finnish x x x
fra French x x x
frk German - Fraktur x x x
frm French, Middle (ca.1400-1600) x x x
gle Irish x x
glg Galician x x x
grc Greek, Ancient (to 1453) (contrib) x x x
guj Gujarati x x
hat Haitian; Haitian Creole x x
heb Hebrew x x x
hin Hindi x x x
hrv Croatian x x x
hun Hungarian x x x
iku Inuktitut x x
ind Indonesian x x x
isl Icelandic x x x
ita Italian x x x
ita_old Italian - Old x x x
jav Javanese x x
jpn Japanese x x x
kan Kannada x x x
kat Georgian x x
kat_old Georgian - Old x x
kaz Kazakh x x
khm Central Khmer x x
kir Kirghiz; Kyrgyz x x
kmr Kurmanji (Kurdish - Latin Script) x
kor Korean x x x
kor_vert Korean (vertical) x
kur Kurdish (Arabic Script) x
kur_ara Kurdish (Arabic Script) x
lao Lao x x
lat Latin x x
lav Latvian x x x
lit Lithuanian x x x
ltz Luxembourgish x
mal Malayalam x x x
mar Marathi x x
mkd Macedonian x x x
mlt Maltese x x x
mon Mongolian x
mri Maori x
msa Malay x x x
mya Burmese x x
nep Nepali x x
nld Dutch; Flemish x x x
nor Norwegian x x
oci Occitan (post 1500) x x
ori Oriya x x
osd Orientation and script detection module x x x x
pan Panjabi; Punjabi x x
pol Polish x x x
por Portuguese x x x
pus Pushto; Pashto x x
que Quechua x
ron Romanian; Moldavian; Moldovan x x x
rus Russian x x x
san Sanskrit x x
sin Sinhala; Sinhalese x x
slk Slovak x x x
slk_frak Slovak - Fraktur (contrib) x x
slv Slovenian x x x
snd Sindhi x
spa Spanish; Castilian x x x
spa_old Spanish; Castilian - Old x x x
sqi Albanian x x x
srp Serbian x x x
srp_latn Serbian - Latin x x
sun Sundanese x
swa Swahili x x x
swe Swedish x x x
syr Syriac x x
tam Tamil x x x
tat Tatar x
tel Telugu x x x
tgk Tajik x x
tgl Tagalog x x x
tha Thai x x x
tir Tigrinya x x
ton Tonga x
tur Turkish x x x
uig Uighur; Uyghur x x
ukr Ukrainian x x x
urd Urdu x x
uzb Uzbek x x
uzb_cyrl Uzbek - Cyrillic x x
vie Vietnamese x x x
yid Yiddish x x
yor Yoruba x
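
The codes in the first column are the values passed to Tesseract's `-l` option, where several languages can be joined with `+` (for example `-l eng+deu`). The following is a minimal Python sketch of checking such a language string against the codes listed above. The names `KNOWN_CODES`, `split_lang_spec` and `unknown_codes` are hypothetical helpers invented for this illustration, not part of Tesseract, and the set holds only a short excerpt of the table.

```python
from typing import Iterable, List

# Excerpt of language codes taken from the table above (not the full list).
KNOWN_CODES = {"afr", "ara", "deu", "deu_frak", "eng", "fra", "osd", "spa"}


def split_lang_spec(spec: str) -> List[str]:
    """Split a language string such as 'eng+deu' into individual codes."""
    return [code for code in spec.split("+") if code]


def unknown_codes(spec: str, known: Iterable[str] = KNOWN_CODES) -> List[str]:
    """Return the codes in `spec` that are not in `known`, in input order."""
    known_set = set(known)
    return [code for code in split_lang_spec(spec) if code not in known_set]


if __name__ == "__main__":
    # 'xyz' is a made-up code used only to show the failure case.
    print(unknown_codes("eng+deu_frak"))
    print(unknown_codes("eng+xyz"))
```

An empty result means every requested code appears in the set. This check says nothing about whether the matching traineddata file is installed or which Tesseract version ships it.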