pdftohtml encoding question [SOLVED]

François Patte francois.patte at math-info.univ-paris5.fr
Wed Mar 12 07:47:45 UTC 2008



On 11.03.2008 13:40, Andras Simon wrote:
| On 3/10/08, François Patte <francois.patte at math-info.univ-paris5.fr> wrote:
|>
|> I am trying to convert a PDF file into HTML using pdftohtml provided by f8.
|>
|> I get an HTML file with "nice" characters like: ’ instead of an apostrophe,
|> or Ã© instead of é...

|
| I don't, but
|
| man pdftohtml
<snip>

Thanks for answering. The problem was solved once I looked at the HTML
file that was produced. This line was missing:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

even though the PDF file was produced from LaTeX with UTF-8 encoding.
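
For anyone hitting the same problem, here is a minimal sketch of a post-processing step that adds the missing line to an HTML file when no charset declaration is present. The function name `ensure_utf8_meta` is hypothetical, and the sketch assumes the file content really is UTF-8 and has already been read into a string. It is one possible workaround, not something pdftohtml does itself.

```python
import re

# The declaration that was missing from the generated file.
META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'


def ensure_utf8_meta(html: str) -> str:
    """Return html with a UTF-8 charset declaration, adding one if absent.

    Hypothetical helper: assumes the document is actually UTF-8 encoded.
    """
    # Leave the document alone if some charset declaration already exists.
    if re.search(r'<meta[^>]*charset\s*=', html, flags=re.IGNORECASE):
        return html

    # Insert the tag right after the opening <head> tag, if there is one.
    new_html, count = re.subn(
        r'(<head(\s[^>]*)?>)',
        lambda m: m.group(1) + '\n' + META,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        # No <head> tag found: fall back to prepending the declaration.
        return META + '\n' + html
    return new_html
```

Calling the function twice on the same text changes nothing the second time, so it can be applied safely to files that already declare a charset.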

One mystery remains: why is the default encoding for the browser (Firefox),
or for OpenOffice, latin1?
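
The symptom itself is consistent with UTF-8 bytes being read as latin1, which is what a program does when it falls back to latin1 because the file declares no charset. A small sketch reproducing it (the function name `misread_as_latin1` is hypothetical and only for illustration):

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were latin1."""
    return text.encode("utf-8").decode("latin-1")


# "é" is two bytes in UTF-8 (0xC3 0xA9); read as latin1 they become two characters.
print(misread_as_latin1("é"))  # prints Ã©
```

The damage is reversible as long as the text has not been re-saved lossily: encoding the garbled string as latin1 and decoding it as UTF-8 gives back the original character.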

Best regards
--
François Patte
UFR de mathématiques et informatique
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 44 55 35 61
http://www.math-info.univ-paris5.fr/~patte