Anyone know of a tasteful LGPL HTML parser in C?

Kyrre Ness Sjobak kyrre at solution-forge.net
Wed Nov 24 18:26:50 UTC 2004


On Wed, 24.11.2004 at 19:22, Daniel Veillard wrote:
> On Wed, Nov 24, 2004 at 12:33:58PM -0500, Jeff Johnson wrote:
> > I'd like to attempt to support
> >    rpm -qp http://download.fedora.redhat.com/.../*.rpm
> > within rpm by applying fnmatch(3) against parsed HTML hrefs.
> > 
> > So I'm questing existing HTML parser implementations before hacking up
> > something myself.
> 
>   libxml2 HTML parser
> 
> > The constraints on my rpm problem/implementation space are:
> >   a) must be LGPL
> 
>   MIT
> 
> >   b) must be in C.
> 
>   yes
> 
> >   c) must be reasonably small and reliable.
> 
>   if you link against the shared lib and use demand paging it's not too
>   big, otherwise it won't fit
> 
> >   d) should work on a significant variety of HTML dialects without problem.
> 
>   people have been using it to build commercial grade Web indexing software
> 
> Daniel

Isn't KHTML LGPL? I know Apple based their Safari browser on it.
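
For what it's worth, the fnmatch(3) half of Jeff's idea is small whichever
parser ends up being used. Below is a minimal sketch, assuming the hrefs have
already been pulled out of the directory listing by an HTML parser. The
function name filter_hrefs and its signature are hypothetical, not anything in
rpm or libxml2; only fnmatch(3) and strrchr(3) are real library calls.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of rpm or libxml2): copy into `out` every
 * href whose last path component matches the shell glob `pattern`, and
 * return the number of matches.
 *
 * Assumptions: `hrefs` holds `n` NUL-terminated strings already extracted
 * by an HTML parser, and `out` has room for at least `n` pointers.
 */
static size_t filter_hrefs(const char *pattern,
                           const char *const *hrefs, size_t n,
                           const char **out)
{
    size_t matched = 0;

    assert(pattern != NULL && hrefs != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Match only the part after the last '/', like a shell would. */
        const char *name = strrchr(hrefs[i], '/');
        name = (name != NULL) ? name + 1 : hrefs[i];

        /* fnmatch(3) returns 0 when the string matches the pattern. */
        if (fnmatch(pattern, name, 0) == 0)
            out[matched++] = hrefs[i];
    }
    return matched;
}
```

Called with the pattern "*.rpm", this would keep the package links and drop
things like the parent directory link and the sort-order links that a typical
server-generated index page contains.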



