Converting html to text ?!?

Patrick O'Callaghan pocallaghan at gmail.com
Fri Mar 5 15:42:26 UTC 2010


On Fri, 2010-03-05 at 09:51 -0500, William Case wrote:
> Hi;
> 
> I have been using Notecase for a couple of years.  I would like to keep
> it but it seems it is no longer being maintained as open source.
> 
> ∴ I would like to use Lyx as my large note taker or draft writer
> application.  The problem is Notecase uses an *.ncd suffix which is
> unrecognized by Lyx.  When I examine the files I want to import from
> Notecase to Lyx they are designated in the header as:
> 
>         <!DOCTYPE NoteCase-File>
>         <!--LastNote:41-->
>         <HTML>
>         <HEAD>
>         <meta content="text/html;charset=UTF-8"
>         http-equiv="Content-Type">
>         <meta name="generator" content="NoteCase 1.6.1">
>         <TITLE></TITLE>
>         ... etc.
> 
> The markup is almost certainly html.
> 
> How can I convert these files into *.txt?  I have tried several
> variations of:
> 
>         ]$ html2text -o ~/UMLC.txt file:///home/bill/NoteCaseDocs/UMLC.
>         *Cannot open input file "file:///home/bill/NoteCaseDocs/UMLC.*".
> 
> Any suggestions greatly appreciated on how to import these (I have
> several) *.ncd files into Lyx.

For a smallish number of files, the easiest approach is probably to open
them in Firefox and use "Save Page As..." with the file type set to text.
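
If there turn out to be too many files to do by hand, something along
these lines might work as a rough sketch. It assumes the .ncd files really
are plain HTML, as the header suggests, and uses only the Python standard
library. The names TextExtractor, html_to_text and convert_file are made
up for illustration; nothing here comes from Notecase or LyX, and any
Notecase-specific structure (note titles, nesting) is simply flattened.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML document (hypothetical helper)."""

    SKIP = {"script", "style"}  # contents of these tags are dropped
    BLOCK = {"title", "p", "br", "div", "li", "tr", "dt", "dd",
             "h1", "h2", "h3", "h4", "h5", "h6"}  # these force a line break

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(markup):
    """Return the text of an HTML string, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def convert_file(path):
    """Write <name>.txt next to the given file and return the new path."""
    src = Path(path)
    dst = src.with_suffix(".txt")
    # UTF-8 is assumed because the quoted header declares charset=UTF-8.
    text = html_to_text(src.read_text(encoding="utf-8", errors="replace"))
    dst.write_text(text + "\n", encoding="utf-8")
    return dst
```

Calling convert_file() on each .ncd file in turn would leave a .txt file
beside it, which can then be imported into LyX as plain text.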

poc


