F14
I saved a text letter with GIMP or XSane as a TIFF file and used tesseract to convert it to text.
I ran this command to convert it:
tesseract out.tif out.txt
And I got this output. What gives?
(9se91d ‘.l9AO) _[T`l:].I9A&iO 1SOLU;S‘EO!l9UIV_ 30 11ed'Bu19q sg '1s9q 9L11"_s1g9u9q_19qw9wT}1N 1no7§_11n30 ‘Xa/uns .|9qwaW7§.|o°V}1N'MA\M 3u111s1A Aq 9u11u0 d1L1s19qLu9Lu 100/{ 9p1z13dn pun A9/uns mol( 9191dLu0o O1 S1 u011d0 19L110uV 'Aep019d019Au9 pglad-93121s0d p9s010u9 9L11 ug 1u9u1/(ed 1n0/{ LII!/¥\ A9/uns 9111 u1n191 ‘u9ql '/(9/uns ,IHOK uo xoq SEL; 9111 >109L10 1sn[ ‘S€l1[ABS 11au0111ppe 9/\19091 pun 193uo1 d1qs19qLu9Lu mo/K PUQIXQ ‘10 330 gg 103 SIQQUSQ 12913 9Lues SLI) pun QUIZBBBIU 12913 QUIES 911130 199/{ IBUOWPPR ue 193 11‘n0/(»93um19 111m 3u1111 2 10N 'SQIQABS gg 2 IB d1L1s19qu19Lu mo/( PUQIXQ uno n0A ‘sfilap 05 1x9u 9L11 103 ‘puv '1e91( 19d ggg /{11v1u10u S1 d1L1s19q1u91/\1 VHN 'Ogg 1u0 103 1L1s19qu191/\1 V}1N 1n0 01,1129 9 ppn O1 11un1_10 0 [2109 s QA 12 no 9/x1 01 9>111 P11 ‘KGAJUS 19qLu91/\1 1211u9pgu0Q 1n0K 919[dLUO3 01 911111 9L11 3u1>1v1 103 n0K >1ueL11 0 113 iMOU)[ 9m 191 pun /(9/uns 19qu191,\1 IBQUQPQUOQ mol( Q19[dU.lO0 9sra91(1 ,;s19u11ed VHN 111015 s19_110 91qnn1nA 9/09091 01 9>111 no/{ p1n0 M LS9)1B1Sd99MS 1n0 191u9 O1 93111 no/( p1n0M [,SJ9119[-9 9913 SLVHN 01 9q110sqns 01 91111 n0A p1noM ‘1noqe 910111 u11a91 O1 91111 pgnol( SIQQUSQ 9s9L11 30 119111/A 9u1 1191 11‘n0& ‘A9A1ng 19qLu91,\1 IEIIIISPHUOD 1n0A gU!l9[dI.LIO0 Ag 's191191sM9u 9913 pun S19330 \B[39dS ‘s1g9u9q mau 30 su9zop QAEQDSJ Ol p91111u9 919 noA ‘19q1u9Lu e sv '1{'2P()19I1I O11[ u1n191 pun K9/uns 19qu191,\1 IBQIUQPQUOQ p9s019u9 9111 919[dll109 O1 IUQUIOIII 12 9>1n1 9S‘l29[d 01 n0A gLI§)[SB Lu‘1 ‘91q1ss0d sn 3u1p19m91 se 99U9!J9dX9 d111s19qLu9Lu V351 mol( 9312111 sn d19L1 Ol ‘1s9nb91 [B}99dS 2 9AEL[ OSI? 
1 'A119q11 u1e:>119LuV O1 1u9Lu11Luw09 &1eu1p101a11x9 mol( 30 s10qLu/(S se 1(1pno1d 1e99p 1n0/{ A1a1ds1p pun pmo 1n0& M129 9s291¢1 ')[0l'LI1 10 120 mol( 103 11a99p [EIOQO pun 13120 d1L1s19qu191u m9u 1n0A 3u1pn1ou1 ‘S1§3U9C[ d1L1s19qLu9Lu VHN mo/{ 30 M93 12 1sn[911a 191191 S!L[1 q11m p9s01oug ‘suue 1'e9q pun d99>1 01 111311 IUSUIPUQLIIV pu099g 1n0 pu939p 01 9111211 3u103u0 1n0 U1 d1L1s19pe911n0/K 103 111191213 912 9m pun ‘LU'291ll'\O uo no/{ 9/uaq Ol p910u0L1 912 9 M 'VXN OJ,1I19lI,I1[l1II1I03 mol( 103 noi( >1u2L11 O11LI'9M 1 ‘SJSQLUQLII /001193 1n0 30 31m19q u() ‘19qw91/\1 V}1N M0119 :1 11e9(1 9.u91(1e'1 aufielm Juapywd .9191 .mg/nysxq nga# :ago 0ao'V}11\1~/A/1/x\ o€ozz VINIUHIA ‘xv1u1v:1 (IVOH 1'u1,\1 SEYIJVAA 051 1 1 v:>mawV 10 Nouvmossv 311111 1vN01.v.vN 39, ...,. ,» _ NHLCI fswau; 1 1. I., FQ _ 11, ~ 1;1 1" -1 15 01 ~ g
On 04/26/2011 03:00 PM, james tate wrote:
This is the exact command I typed in:
$ tesseract Untitled.tif Untitled
Tesseract Open Source OCR Engine with LibTiff
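A note on the two command forms in this thread: tesseract treats its second argument as an output base name and appends .txt itself, so "tesseract out.tif out.txt" would write out.txt.txt, while "tesseract Untitled.tif Untitled" writes Untitled.txt. A minimal sketch of deriving that base name, assuming a POSIX shell; ocr_base and ocr_to_text are hypothetical helper names, not part of tesseract:

```shell
#!/bin/sh
# ocr_base IMAGE: print the output base name to pass to tesseract.
# Hypothetical helper for illustration; tesseract itself adds ".txt".
ocr_base() {
    case ${1##*/} in
        # File name has an extension: drop the last one (out.tif -> out).
        *.*) printf '%s\n' "${1%.*}" ;;
        # No extension: use the name unchanged.
        *)   printf '%s\n' "$1" ;;
    esac
}

# ocr_to_text IMAGE: run tesseract with the derived base, if it is installed.
ocr_to_text() {
    image=$1
    base=$(ocr_base "$image")
    if command -v tesseract >/dev/null 2>&1; then
        tesseract "$image" "$base" && printf 'wrote %s.txt\n' "$base"
    else
        printf 'tesseract not found on PATH\n' >&2
        return 127
    fi
}
```

Calling ocr_to_text Untitled.tif would then run the same command as the one quoted above.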
On Wed, Apr 27, 2011 at 5:00 AM, james tate <binarynut@comcast.net> wrote:
And I got this output. What gives?
Wild stab, but you might need a language pack? Also, I think tesseract only works with TIFFs of certain bit depths and certain compression types.
-c
On 04/26/2011 07:15 PM, Chris Smart wrote:
Wild stab, but you might need a language pack? Also, I think tesseract only works with TIFFs of certain bit depths and certain compression types.
-c
What language pack? There is no English pack in the Fedora repo.
On Wed, Apr 27, 2011 at 9:29 AM, james tate <binarynut@comcast.net> wrote:
What language pack? There is no English pack in the Fedora repo.
I don't know how Fedora packages it. Like I said, it's a wild stab in the dark :-)
From: http://code.google.com/p/tesseract-ocr/
"Important Download Information: The language data files are separate from the code! See the ReadMe wiki for installation and usage information!"
-c
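One way to test the language-data guess is to look for files named eng.* in the tessdata directory. The sketch below assumes a POSIX shell; has_lang_data and the TESSDATA_DIR variable are hypothetical names introduced here, and the default path is only a guess at where a distribution might install the data, so adjust it for your system:

```shell
#!/bin/sh
# has_lang_data DIR LANG: succeed if DIR contains any file named LANG.*
# (for example eng.traineddata). Hypothetical helper, not part of tesseract.
has_lang_data() {
    dir=$1
    lang=$2
    for f in "$dir/$lang".*; do
        # An unmatched glob stays literal, so confirm the file really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# TESSDATA_DIR and its default are assumptions, not a documented location.
TESSDATA_DIR=${TESSDATA_DIR:-/usr/share/tesseract/tessdata}
if has_lang_data "$TESSDATA_DIR" eng; then
    echo "English data found in $TESSDATA_DIR"
else
    echo "No eng.* files in $TESSDATA_DIR"
fi
```

If no eng.* files turn up, the separate language data mentioned on the project page is the next thing to install.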
On Tue, 2011-04-26 at 19:29 -0400, james tate wrote:
What language pack? There is no English pack in the Fedora repo.
----
It's been a number of years since I used tesseract, but I recall that it was relatively simple to get working.
1. Did you run the test TIFF image to verify it worked first before you tossed your own TIFF image at it?
2. What does 'file' tell you about the image file you are trying to OCR?
Craig
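The second check, together with the earlier question about bit depth and compression, can be sketched as a small shell function. This assumes a POSIX shell; describe_image is a hypothetical helper name, and tiffinfo is only available if libtiff's command-line tools are installed:

```shell
#!/bin/sh
# describe_image IMAGE: print what file(1) and tiffinfo report about IMAGE.
# Hypothetical helper for illustration only.
describe_image() {
    image=$1
    if [ ! -r "$image" ]; then
        printf 'cannot read %s\n' "$image" >&2
        return 1
    fi
    # file(1) identifies the container type from the file's contents.
    if command -v file >/dev/null 2>&1; then
        file "$image"
    fi
    # tiffinfo prints per-image tags, including the "Bits/Sample" and
    # "Compression Scheme" lines; errors on non-TIFF input are ignored here.
    if command -v tiffinfo >/dev/null 2>&1; then
        tiffinfo "$image" || true
    fi
    return 0
}
```

Running describe_image Untitled.tif and posting the output would answer both questions at once.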