png2txt -

Craig White craigwhite at azapple.com
Mon Jun 30 15:43:41 UTC 2008


On Mon, 2008-06-30 at 16:26 +0100, Paul Smith wrote:
> On Sat, Jun 28, 2008 at 5:32 PM, Bob Goodwin USA
> <bobgoodwin at wildblue.net> wrote:
> > fred smith wrote:
> >>>>> Is there an F8 application that will convert a .png copy of a text list
> >>>>> to a text file?
> >>>>
> >>>> ----
> >>>> png is a picture file and there is no text.
> >>>>
> >>>> If you want OCR (optical character recognition - software that scans a
> >>>> picture for recognizable text and saves the recognized text to a file),
> >>>> I would suggest tesseract.
> >>>
> >>> Thanks, I will look at that.
> >>>
> >>
> >> I believe that Tesseract only understands TIF files, so you will need
> >> to convert the png before you can OCR them.
> >>
> >>
> >
> > Yes, I discovered that requirement but now I am stumped by -
> >
> >   The command line is:
> >   tesseract <image.tif> <output> [-l langid]
> >
> > I thought "-l enUS" might work but no go there.
> >
> > There's no man page, only a README and that doesn't tell me about the langid
> > other than it wants it.  Without it I get very strange looking text.
> 
> Unfortunately, the OCR programs working in Linux are not very good
> yet. In case you have access to Acrobat Professional, use it instead;
> the results are usually excellent.
----
I've never used Acrobat Professional for OCR but I have gotten excellent
results from tesseract on Linux.

OP should check out these two articles:

http://www.groklaw.net/article.php?story=20061210115516438&query=tesseract

http://www.linuxjournal.com/article/9676
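
For the command line quoted above, here is a minimal sketch of the PNG-to-text steps. It assumes ImageMagick's convert is installed and that the English language data is installed under the three-letter code "eng" (tesseract takes codes of that form rather than "enUS"). The function name ocr_png and the file name list.png are made up for illustration, and writing the TIFF uncompressed is a precaution for older tesseract builds, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: convert a PNG to TIFF, then run tesseract on the TIFF.
# Assumptions: ImageMagick's `convert` is on the PATH, and the English
# language data is installed under the code "eng".

ocr_png() {
    png="$1"
    base="${png%.png}"      # strip the .png suffix: list.png -> list

    # Write an uncompressed TIFF next to the PNG.
    convert "$png" -compress none "$base.tif" || return 1

    # tesseract adds the .txt extension to the output base name itself,
    # so this writes "$base.txt".
    tesseract "$base.tif" "$base" -l eng
}

# Usage (hypothetical file name):
#   ocr_png list.png
```

If the text still comes out garbled, the scan itself is worth a look before the language setting: a small or low-contrast image tends to OCR badly whatever langid is given.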

Craig



