Convert PDF to Text?

Keith G. Robertson-Turner fedora-gmane.00003 at genesis-x.nildram.co.uk
Sat Apr 21 21:31:51 UTC 2007


I have some PDF documents that are photocopied text documents (each page
is an embedded image, rather than text glyphs). When I open these with
Evince, I am able to copy and paste the actual text. At first I thought
this was some kind of OCR process in Evince, but then I realised the text
comes from the document itself, which has the original text embedded in
it (OCRed and embedded during the original scan).

Is there any command I can use to extract the text from these PDF
documents in a batch? I have a couple of thousand documents that need
converting.

Just curious: since Evince can obviously do it (manually), the necessary
library components, at least, must already be installed (FC6).
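
To illustrate the kind of batch job I mean, here is a rough sketch. It
assumes the pdftotext utility from poppler-utils is installed (Poppler
being the PDF library that Evince builds on), and the helper names
output_path, build_command and convert_tree are made up for this example.
I have not confirmed that this recovers the embedded OCR text from every
scan.

```python
import subprocess
import sys
from pathlib import Path


def output_path(pdf):
    """Return the .txt path that sits next to the given PDF."""
    return pdf.with_suffix(".txt")


def build_command(pdf, out):
    """Build the pdftotext call (assumed installed via poppler-utils).

    -layout asks pdftotext to keep the original page layout as far as it can.
    """
    return ["pdftotext", "-layout", str(pdf), str(out)]


def convert_tree(root):
    """Run pdftotext on every *.pdf under root; return the PDFs that failed.

    The pattern is case-sensitive on Linux, so files named *.PDF are skipped.
    """
    failed = []
    for pdf in sorted(root.rglob("*.pdf")):
        result = subprocess.run(build_command(pdf, output_path(pdf)))
        if result.returncode != 0:
            failed.append(pdf)
    return failed


if __name__ == "__main__":
    # Directory to scan: first argument, or the current directory.
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for pdf in convert_tree(root):
        print(f"failed: {pdf}", file=sys.stderr)
```

Saved under a hypothetical name such as pdf2txt.py, it would be run as
"python pdf2txt.py /path/to/scans", writing each .txt file next to its
PDF. If pdftotext is missing, subprocess.run raises FileNotFoundError
on the first file.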

TIA.

-- 
K.
http://slated.org

.----
| I found [Vista] to be a dangerously unstable operating system,
| which has caused me to lose data ... unfortunately this product
| is unfit for any user. - [H]ardOCP, <http://tinyurl.com/3bpfs2>
`----

Fedora Core release 5 (Bordeaux) on sky, running kernel 2.6.20-1.2312.fc5
 22:29:39 up 4 days, 20:01,  3 users,  load average: 0.53, 0.48, 0.48



