Stephen Liu wrote:
All documents on the Internet that were printed as .ps files and/or
later converted to .pdf have a problem. Although they can be retrieved
and read, their text can't be highlighted or copied and pasted. I don't
know what mistake was made. Any suggestions?
Text can be stored in a PDF as a series of character codes (e.g. 65 is A)
plus font information, or as a picture of the text. Which one you get
depends on how the PDF was created (and in this case, how the PostScript
version was created in the first place -- often there'll be an option
somewhere in whatever program created them to produce bitmaps or to
include font information).
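
To make the difference concrete, here is a rough Python sketch (not how
a PDF is actually laid out inside; real PDFs map codes through the
font's encoding, and the 5x5 bitmap is made up for illustration). It
shows the same letter held once as a number and once as a picture:

```python
# Text stored as character codes: each letter is just a number.
text = "PDF"
codes = [ord(ch) for ch in text]
print(codes)                            # [80, 68, 70]
print(chr(65))                          # A
print("".join(chr(n) for n in codes))   # PDF

# The same idea as a picture: a made-up 5x5 bitmap of the letter A.
# It holds only pixels ("#" = ink); nothing in it says "this is code 65".
picture_of_a = [
    ".###.",
    "#...#",
    "#####",
    "#...#",
    "#...#",
]
ink = sum(row.count("#") for row in picture_of_a)
print(ink)                              # 14 inked pixels, but no letter
```

A viewer can select and copy the first form because the numbers are
right there. With the second form all it has is dots.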
Once it's turned into a picture, then there's no easy way to go back to
the text it was created from. This isn't a limitation of your program,
but of existing technology. There are "OCR" programs that can "read" the
text in the same way as you or I would -- they look at the shapes, and
try to recognise letters. But they aren't foolproof (or particularly
fast).
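
As a toy illustration of the "look at the shapes" idea, here is a
minimal Python sketch that compares a bitmap against a few known letter
shapes and picks the closest. The templates and the names TEMPLATES,
pixel_distance and recognise are all invented for this example; real OCR
programs are far more sophisticated than this.

```python
# Hypothetical 5x5 letter shapes ("#" = ink), invented for this sketch.
TEMPLATES = {
    "A": [".###.", "#...#", "#####", "#...#", "#...#"],
    "H": ["#...#", "#...#", "#####", "#...#", "#...#"],
    "O": [".###.", "#...#", "#...#", "#...#", ".###."],
}

def pixel_distance(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognise(bitmap):
    """Return the template letter whose shape is closest to the bitmap."""
    return min(TEMPLATES,
               key=lambda letter: pixel_distance(bitmap, TEMPLATES[letter]))

# An "A" that lost one pixel in scanning is still closest to A.
scanned = [".###.", "#...#", "####.", "#...#", "#...#"]
print(recognise(scanned))   # A

# The same "A" with a badly damaged top row now looks more like an H.
smudged = ["#.#.#", "#...#", "####.", "#...#", "#...#"]
print(recognise(smudged))   # H
```

The second result is the "not foolproof" part: the program can only
guess from the shape, so enough damage to the picture gives the wrong
letter.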
Hope this helps,
James.
--
E-mail address: james  | Helpful Advice from Thames Water:
@westexe.demon.co.uk   | "If you have difficulty reading this leaflet,
                       |  please ask someone to help you."
                       |  -- Read on "The News Quiz", BBC Radio 4